Abstract

Jet Substructure at the Large Hadron Collider: 
Harder, Better, Faster, Stronger 

Christopher K. Vermilion

Chair of the Supervisory Committee: 
Professor Stephen D. Ellis
Department of Physics 

I explore many aspects of jet substructure at the Large Hadron Collider, ranging
from theoretical techniques for jet calculations, to phenomenological tools for better
searches with jets, to software for implementing and comparing such tools. I begin
with an application of soft-collinear effective theory, an effective theory of QCD
applied to high-energy quarks and gluons. This material is taken from [1], in which we
demonstrate factorization and logarithmic resummation for a certain class of
observables in electron-positron collisions. I then explore various phenomenological aspects
of jet substructure in simulated events. After observing numerous features of jets at
hadron colliders, I describe a method, jet pruning, for improving searches for
heavy particles that decay to one or more jets. This material is a greatly expanded
version of [2]. Finally, I give an overview of the software tools available for these kinds
of studies, with a focus on SpartyJet, a package for implementing and comparing
jet-based analyses that I have collaborated on. Several detailed calculations and software
examples are given in the appendices. Sections with no new content are italic in the
Table of Contents.

GLOSSARY



ASYMPTOTIC FREEDOM: The property of QCD that the strong coupling is weak
at high energies. This means that high-energy processes can be calculated
perturbatively, and that partons within hadrons will appear as weakly-bound
constituents if probed at high energy.

COLOR: The QCD charge, analogous to electric charge. Quarks carry one of three 
fundamental colors (red, green, and blue); gluons can be thought of as carrying 
a color-anticolor pair, such as (green-antired). Particles that do not carry color
charge are color singlets. 

CONFINEMENT: The reverse of asymptotic freedom: at low energies (below
$\Lambda_{\rm QCD} \sim 200$ MeV) quarks and gluons are bound into hadrons because of the
strength of the coupling in this regime.

FACTORIZATION: The division of a physics process into subprocesses which can be 
calculated separately, and which typically depend on fewer energy scales than 
the full process. A final cross section can then be expressed as a product of 
several functions, each of which depends on a subset of the relevant variables 
characterizing the event. For example, factorization is what allows the non- 
perturbative evolution of incoming protons, and the likelihood to find a parton 
of given momentum in them, to be treated separately from the perturbative 
hard scattering. 

FINAL-STATE RADIATION (FSR): Radiation from outgoing particles produced in
the hard scattering.

HADRON: A bound state of a quark and an antiquark (meson) or of three
quarks/antiquarks (baryon). Hadrons are the relevant particles in QCD at low
energies (compared to $\Lambda_{\rm QCD} \sim 200$ MeV).

HARD SCATTERING: The central high-energy process at a hadron collider, where 
two quarks or gluons collide to produce 2 or more other high-energy particles. 
The outgoing particles then typically decay to produce the particles seen in the 
detector. The hard scattering is to be contrasted to the subsequent final-state 
radiation, previous initial-state radiation, and the underlying event. 

INITIAL-STATE RADIATION (ISR): Radiation from incoming particles in the hard 
scattering. 

JET: A mostly collimated spray of hadrons produced by the showering of one or
more quarks or gluons at a particle collider.

JET ALGORITHM: A procedure for constructing jets from initial objects such as 
particles or calorimeter cells. 

LEADING LOG: A cross section is correct to leading logarithm accuracy if it
includes all terms of order $\alpha_s^n L^{2n}$, where $L$ is some large logarithm. See Sec. 2.2.3.

MONTE CARLO X: X uses, or was produced using, random numbers, as in a "Monte
Carlo event generator" or a "Monte Carlo data set". Refers to the famous casino
in Monaco.

PARTON: A quark or gluon. The term comes from the "parton model", a phe- 
nomenological model of the strong interaction that predates QCD. 




PARTON SHOWER: The process whereby a high-energy quark or gluon repeatedly
radiates soft and collinear gluons, which subsequently radiate themselves,
producing a multiplicity of low-energy partons, which will later hadronize. A
"parton shower Monte Carlo" such as Pythia is a program that simulates this
process, typically only including the leading-log portion of the gluon emission
matrix element (i.e., the double soft/collinear singularity).

PERFECT STRANGERS: An American sitcom that ran from 1986 to 1993 on ABC.
It "chronicles the rocky coexistence of Larry Appleton (Mark Linn-Baker) and
his distant cousin Balki Bartokomous (Bronson Pinchot)" [3]. Notable for
producing the spin-off series Family Matters in 1989.

PILE-UP (PU): The effect of multiple proton collisions occurring at once at the
LHC. The expected energy exchanged in a proton collision has a sharply falling
distribution, so most pile-up interactions are much less energetic than the main
interaction, which is selected to have very large momentum transfer (e.g.,
having several high-$p_T$ jets). Pile-up collisions are completely independent of the
principal interaction.

QCD JET: A jet that includes the shower of one or more partons from the hard
scattering but has no intrinsic mass scale. This is in contrast to a "heavy
particle jet", which includes the shower from multiple quarks and/or gluons
that were produced in the decay of a massive particle.

SPLASH-IN/SPLASH-OUT: Splash-in is radiation included in a jet that did not come 
from the showering of the initial parton(s). Splash-out is the reverse: radiation 
that came from the initial parton(s) but is not included in that jet. Note that
after hadronization splash-in and splash-out cannot be unambiguously defined
unless the initial partons form a color singlet.

UNDERLYING EVENT (UE): The combined effect of beam remnants and their po- 
tential multiple interactions. Beam remnants are what remains of the colliding 
protons after one parton each is involved in the hard scattering. At minimum, 
they must combine with other parts of the events to create color singlet hadrons 
for the final state. The beam remnants can also produce secondary collisions, 
known as multiple parton interactions (MPI). This typically produces additional 
low-pT jets in the event as well as soft radiation throughout the detector. The 
underlying event is approximately independent of the hard scattering, but is 
typically color-connected and thus impossible to separate completely.

HADRON COLLIDER VARIABLES: Kinematic variables at a hadron collider are
chosen to have simple behavior under Lorentz boosts along the beam axis, since
the center-of-momentum frame of the initial parton collision is only known up
to such a boost. Here are the most important variables:

$\phi$ Azimuthal angle about the beam axis.

$y$ Rapidity, $y = \frac{1}{2}\ln\left(\frac{E + p_z}{E - p_z}\right)$. Under a longitudinal boost $\gamma = \cosh y_b$,
$y \to y + y_b$.

$\eta$ Pseudorapidity, equal to $y$ for massless particles; maps directly to polar
angle: $\eta = -\ln\tan(\theta/2)$.

$p_T$ Momentum transverse to the beam axis, $p_T^2 = p_x^2 + p_y^2$.

$\Delta R(p_1, p_2)$ Longitudinal boost-invariant angle between two particles:
$\Delta R^2(p_1, p_2) = (\phi_1 - \phi_2)^2 + (y_1 - y_2)^2$.

$z(p_1, p_2)$ Minimum transverse momentum fraction for a merging/splitting from/to
$p_1$ and $p_2$: $z = \frac{\min(p_{T1},\, p_{T2})}{p_{T,1+2}}$, where $p_{1+2} = p_1 + p_2$.

NOTE: $z$ and $\Delta R$ are useful in describing the twin soft and collinear singularities
of QCD radiation. An emission with small $z$ is soft; an emission with small
$\Delta R$ is collinear.
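The variable definitions above can be checked numerically. The following is a minimal sketch, assuming four-vectors stored as plain tuples (E, px, py, pz) with the beam along z; the function names and the two example momenta are illustrative choices and do not come from any particular jet package.

```python
import math

def rapidity(p):
    """Rapidity y = (1/2) ln[(E + pz) / (E - pz)] of p = (E, px, py, pz)."""
    E, _, _, pz = p
    return 0.5 * math.log((E + pz) / (E - pz))

def pseudorapidity(p):
    """Pseudorapidity eta = -ln tan(theta/2), theta measured from the beam (z) axis."""
    _, px, py, pz = p
    theta = math.atan2(math.hypot(px, py), pz)
    return -math.log(math.tan(theta / 2.0))

def transverse_momentum(p):
    """pT = sqrt(px^2 + py^2)."""
    return math.hypot(p[1], p[2])

def delta_r(p1, p2):
    """Boost-invariant distance: Delta R^2 = (phi1 - phi2)^2 + (y1 - y2)^2."""
    dphi = math.atan2(p1[2], p1[1]) - math.atan2(p2[2], p2[1])
    dphi = (dphi + math.pi) % (2.0 * math.pi) - math.pi  # wrap into [-pi, pi)
    return math.hypot(dphi, rapidity(p1) - rapidity(p2))

def z_fraction(p1, p2):
    """z = min(pT1, pT2) / pT of the summed four-vector p1 + p2."""
    total = [a + b for a, b in zip(p1, p2)]
    return min(transverse_momentum(p1), transverse_momentum(p2)) / transverse_momentum(total)

def boost_along_beam(p, y_b):
    """Longitudinal boost with rapidity y_b, i.e. gamma = cosh(y_b)."""
    E, px, py, pz = p
    ch, sh = math.cosh(y_b), math.sinh(y_b)
    return (E * ch + pz * sh, px, py, pz * ch + E * sh)

# Two massless example particles (illustrative numbers, in GeV).
p1 = (130.0, 30.0, 40.0, 120.0)
p2 = (50.0, 40.0, 0.0, 30.0)
```

Boosting both particles with `boost_along_beam` shifts each rapidity by the same `y_b`, so `delta_r` and `z_fraction` are unchanged, which is the reason these variables are preferred at a hadron collider.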

NOTE ON CONVENTIONS: In this thesis I attempt to follow the conventions of
Peskin and Schroeder [4] where possible. In particular, this includes a "West
Coast metric", $g^{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$, so $p^2 = m^2$ for an on-shell particle of
mass $m$. The gamma matrices $\gamma^\mu$ are defined in the Weyl basis.

"Natural" units, where $\hbar = c = 1$, are used throughout.

Chapter 1
INTRODUCTION 

At the dawn of the LHC era, the prospects for high-energy particle physics are 
bright. The Large Hadron Collider [5, 6] will almost certainly resolve some of the
outstanding questions in particle physics. How is electroweak symmetry, central to 
the remarkably successful Standard Model (SM), broken — as it necessarily must be? 
What is the nature of dark matter, which makes up a quarter of the mass of the 
universe? Why is the Planck mass, the only "natural" scale in the universe, so much 
bigger than everything else? And why are there so many particles in the Standard 
Model, anyway? The possibilities for new discoveries are endless. 

And yet — prospects for easy discovery are bleak. Almost any physical effect 
observable at the LHC will require either deeply sophisticated analysis techniques, 
patient accumulation of vast statistics, or both. A quick and easy discovery, with 
a few exceptions [7], would already have been made at earlier experiments at the
Tevatron [8, 9] or LEP [10]. 

The chief difficulty in discovering new physics at the LHC is that new particles 
created will almost certainly exist for times much too short to ever interact directly 
with the detectors surrounding the point of collision. Most new particle searches, then, 
are concerned with observing decay products. To be observed in the detector, these 
decay products must be stable enough to get there and interact strongly enough to 
be noticed. These two requirements ensure that the most likely candidates are simply 
SM particles; certainly anything that can be produced by the collision of protons must 
be able to decay back to SM particles. The unfortunate upshot is that the signature of
new physics (the signal we want to see) will almost necessarily have a substantial
overlap with the signatures of well-known SM processes that can produce the same 
decay products. 

One of the most difficult types of signals will involve decays to quarks and gluons, 
which are subsequently observed as the phenomena known as jets. The SM cross 
sections for basic processes involving jets, even when a W or Z boson is involved, 
typically dwarf any new physics signals with similar signature. The common
supersymmetric signature of a lepton, jets, and missing energy is easily faked by the
W+jets background. To have any hope of extracting these kinds of signals, we will
need to advance our understanding and usage of jets. 

Fortunately, many such advances have been made in recent years. Theoretical ad- 
vances have extended the precision with which we can predict cross sections involving 
jets, both through brute force calculations to higher order in perturbation theory and 
through new effective theories that make these calculations more tractable. In the 
latter category, soft/collinear effective theory (SCET) [11, 12, 13, 14] has shown great
potential to improve our ability to calculate jet-based observables by factorizing the 
relevant calculations into separate pieces involving single energy scales. These im- 
proved theory tools will help us to better characterize the backgrounds to interesting 
new signals. 

Another theoretical development in the run-up to the LHC has been increased 
interest in jet substructure. Whereas jets at previous experiments were typically 
thought of as corresponding to a single initial quark or gluon, this will not always be a 
good model at the LHC. In particular, if heavy particles that decay to multiple quarks 
or gluons are highly boosted, the jets corresponding to the multiple decay products 
will move closer together and eventually appear as a single jet. For example, while a 
top quark decaying $t \to Wb \to u\bar{d}b$ would be observed as three jets at the Tevatron,
it is now common to imagine "top jets" in LHC analyses. Finding these jets, as well
as the single jets arising from decays of new particles, requires a new way of thinking
about jets. A jet corresponding to a top quark can be expected to have a mass, as
well as substructure related to the two-step decay $t \to Wb \to q\bar{q}'b$. Separating top
jets from their QCD doppelgangers requires understanding the substructure of both 
kinds of jets. 

Beyond understanding the physics of parton showers and decays, we must consider 
the experimental environment in which we observe these phenomena. The LHC will be 
a phenomenally noisy experiment. We must account for radiation from the incoming 
protons, the interactions of the "beam remnants" (components of the protons not 
involved in the largest-energy collision), and even the effect of more than one pair 
of protons colliding at once. All of these are sources of additional radiation in LHC 
events, and hence will contribute to the characteristics of observed jets. An important 
development in the last few years has been the variety of ideas related to "filtering" 
jets to remove many of these contributions [15, 16, 17, 18, 19, 2, 20].

As the theoretical tools to find, measure, and modify jets proliferate, the need for 
software to easily implement them grows. The FastJet package [21, 22] has provided
efficient implementations of nearly all common jet algorithms, as well as facilities
for user-defined plugins and tools. More recently, the SpartyJet package [23, 24]
has emerged as an analysis package that extends the capabilities of FastJet with
support for a variety of input and output methods, simple chains of jet measurement 
and modification tools, and an increasingly powerful graphical interface for quickly 
comparing and exploring different analyses. 

This thesis is divided into four main sections. Chapter 2 provides background 
to the rest of the thesis, surveying QCD, effective theories of QCD like SCET, and 
basic jet physics. Chapter 3 demonstrates the ability of SCET to factorize jet
observables in $e^+e^-$ collisions. Its content is essentially the same as [1]; further details are
given in the companion paper [25] and in Jonathan Walsh's thesis [26]. Chapter 4
discusses predictions and (Monte Carlo) observations of jet substructure in heavy
particle decays and their QCD background. The theoretical discussion is taken from [2];
the demonstrative plots and accompanying discussions are new. Chapter 5 describes
and explores a method for improving heavy particle searches using "pruned" jet
substructure. This chapter is also largely drawn from [2], but the example plots and
accompanying discussion in Secs. 5.2 and 5.3, as well as the discussions in Secs. 5.6
and 5.7, are new. Chapter 6 surveys the available software tools for studying jet
substructure, with emphasis on the tools implementing jet pruning developed by the
author, and the SpartyJet package, to which the author has made significant
contributions. The text is entirely new. Finally, in Chapter 7, these various strands are
tied together and the Future of the Jet is considered thoughtfully.

Chapter 2
QCD PHENOMENOLOGY 

Quantum Chromodynamics (QCD) is well established as our best theory of the 
strong interaction, governing the behavior of hadrons such as protons and neutrons as 
well as their constituents, quarks and gluons. QCD is a gauge quantum field theory, 
similar to quantum electrodynamics (QED), Feynman's "strange theory of light and 
matter". As I will review in this chapter, there are several important differences 
between QCD and QED, which lead to a theory at once richer and more challenging. 

I will begin with a review of the QCD Lagrangian, the running of the strong 
coupling, and the twin features of asymptotic freedom and infrared slavery
("confinement", if you prefer). I then give an example of a perturbative QCD calculation of the
cross section for $e^+e^-$ annihilation into hadrons, on the way encountering many of the
fundamental issues that appear in perturbative QCD. This will include a discussion
of several systematic approaches to improving the precision of such calculations. In
Sec. 2.3, I discuss soft-collinear effective theory, an effective theory of high-energy
quarks and gluons at particle colliders. Finally, in Sec. 2.4 I discuss "jets", the chief
QCD phenomenon observed and studied in collider experiments. Sec. 2.3 is intended
as background to Chapter 3; Sec. 2.4 is intended as background to Chapters 4 and 5.

2.1 Basics 

That this review of QCD phenomenology is incomplete is too obvious to belabor. 
What follows is a list of references and reviews, themselves incomplete but collectively
comprehensive. A considerably more exhaustive survey of the QCD literature can be
found in [27].






The basic features of QCD and gauge theories are discussed in the standard
textbooks, e.g. [4, 28, 29, 30]. A more focused resource, geared toward collider physics, is
[31], known universally as "the pink book". [31] also contains a broad set of citations
to the theoretical and experimental literature. An extensive review of perturbative
QCD is given in [32]. An extensive review of the non-perturbative aspects of QCD
(and field theories in general) is given in [33]. References specifically relevant to the
following sections and chapters will be given therein.

2.1.1 The QCD Lagrangian 

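The Lagrangian of QCD, omitting for now gauge-fixing terms, is

$$\mathcal{L}_{\rm QCD} = -\frac{1}{4} F^A_{\alpha\beta} F_A^{\alpha\beta} + \sum_{\rm flavors} \bar{q}_a \left(i\not{D} - m\right)_{ab} q_b. \qquad (2.1)$$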

The second term represents a set of spin-1/2 quarks, interacting with a gauge field
hiding in the covariant derivative $\not{D}$ (Dirac indices have been suppressed). The gauge
interaction corresponding to QCD is SU(3), with the gauge charge conventionally
referred to as "color". Quarks (antiquarks) live in the fundamental (antifundamental)
representation of SU(3), so a quark field carries a color index: $q_a$, where $a$ runs from 1
to 3. The gauge fields, called gluons, live in the adjoint (dimension 8) representation.
Note that SU(3) is non-Abelian, so the field strength term $-\frac{1}{4} F^A_{\alpha\beta} F_A^{\alpha\beta}$ contains
self-interaction terms:

$$F^A_{\alpha\beta} = \left[\partial_\alpha A^A_\beta - \partial_\beta A^A_\alpha - g f^{ABC} A^B_\alpha A^C_\beta\right]. \qquad (2.2)$$

The indices {A,B,C} run over the eight color degrees of freedom for the gluon. The 
interaction terms mean that the gluons themselves carry color charge. This is the key 
distinguishing feature between the QCD and QED Lagrangians: gluons interact with 
each other where photons do not. 

The sum over flavors in Eq. 2.1 runs over the six quark flavors. As far as the strong
interaction is concerned these flavors are identical except for their different masses. In
the electroweak sector the quarks are grouped into three pairs termed "generations".
The quark flavors and their approximate masses are given in Table 2.1.



Name (symbol)    Electric charge    Mass
Up (u)           +2/3               1.5-3.3 MeV
Down (d)         -1/3               3.5-6.0 MeV
Charm (c)        +2/3               1.27 (+…/-…) GeV
Strange (s)      -1/3               101 (+…/-…) MeV
Top (t)          +2/3               171.2 ± 2.1 GeV
Bottom (b)       -1/3               4.20 (+…/-…) GeV

Table 2.1: The six flavors of quarks. Masses are taken from [34]. Note that there
is some ambiguity in defining a "quark mass", since quarks do not propagate as free
particles. See [34] and the references therein for further discussion of this subtle point,
e.g.: "The estimates of u and d masses are not without controversy and remain under
active investigation."



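It is worth making the color structure of Eqs. 2.1 and 2.2 explicit. The covariant
derivatives, acting on quark (color triplet) and gluon (color octet) fields, are

$$(D_\alpha)_{ab} = \partial_\alpha \delta_{ab} + ig\left(t^C A^C_\alpha\right)_{ab}, \qquad (D_\alpha)_{AB} = \partial_\alpha \delta_{AB} + ig\left(T^C A^C_\alpha\right)_{AB}. \qquad (2.3)$$

The $A^C_\alpha$ are the eight gluon fields, multiplying the fundamental (adjoint) generators
$t^C$ ($T^C$). The generators obey the standard SU(N) relations:

$$[t^A, t^B] = i f^{ABC} t^C, \qquad [T^A, T^B] = i f^{ABC} T^C, \qquad (T^A)_{BC} = -i f^{ABC}, \qquad (2.4)$$

where $f^{ABC}$ are the structure constants of SU(N). The normalization of the
fundamental generator matrices is chosen such that

$$\mathrm{Tr}\, t^A t^B = T_R\, \delta^{AB} = \frac{1}{2}\delta^{AB}. \qquad (2.5)$$

An explicit form of the $t^A$ is not usually necessary, but the following relations are
useful:

$$\sum_A t^A_{ab} t^A_{bc} = C_F\, \delta_{ac}, \quad C_F = \frac{N^2 - 1}{2N}; \qquad \sum_{A,B} f^{ABC} f^{ABD} = C_A\, \delta^{CD}, \quad C_A = N.$$

For SU(3), the color factors are $C_F = 4/3$ and $C_A = 3$.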
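The SU(3) color factors quoted above can be verified with an explicit basis. The sketch below assumes the Gell-Mann matrices as that basis (the text does not fix one) and uses small hand-written matrix helpers; all names are illustrative.

```python
from itertools import product

S3 = 3 ** -0.5
# The eight Gell-Mann matrices (an assumed explicit basis; any equivalent basis works).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]
# Fundamental generators t^A = lambda^A / 2.
T_FUND = [[[entry / 2 for entry in row] for row in lam] for lam in GELL_MANN]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def structure_constants(t):
    """f^{ABC} from [t^A, t^B] = i f^{ABC} t^C, using Tr(t^A t^B) = delta^{AB}/2."""
    n = len(t)
    f = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for a, b in product(range(n), repeat=2):
        ab, ba = matmul(t[a], t[b]), matmul(t[b], t[a])
        comm = [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]
        for c in range(n):
            f[a][b][c] = (-2j * trace(matmul(comm, t[c]))).real
    return f

F = structure_constants(T_FUND)

# Normalization: Tr(t^A t^B) should equal (1/2) delta^{AB}.
NORM = [[trace(matmul(T_FUND[a], T_FUND[b])).real for b in range(8)] for a in range(8)]

# Fundamental Casimir: sum_A t^A t^A should equal C_F times the 3x3 identity.
CASIMIR_F = [[sum(matmul(t, t)[i][j] for t in T_FUND).real for j in range(3)]
             for i in range(3)]

# Adjoint Casimir: sum_{A,B} f^{ABC} f^{ABD} should equal C_A delta^{CD}.
CASIMIR_A = [[sum(F[a][b][c] * F[a][b][d] for a in range(8) for b in range(8))
              for d in range(8)] for c in range(8)]

print("C_F =", CASIMIR_F[0][0], " C_A =", CASIMIR_A[0][0])
```

The diagonal entries of `CASIMIR_F` and `CASIMIR_A` reproduce 4/3 and 3 up to floating-point rounding, and the off-diagonal entries vanish.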
Gauge fixing and ghosts 

The QCD Lagrangian is invariant under the gauge transformation

$$q_a(x) \to \left(e^{i t\cdot\theta(x)}\right)_{ab} q_b(x) \equiv U(x)_{ab}\, q_b(x), \qquad (2.6)$$

$$A_\alpha \to U(x)\, A_\alpha\, U^{-1}(x) + \frac{i}{g}\left(\partial_\alpha U(x)\right) U^{-1}(x), \qquad (2.7)$$

defined by the matrix-valued function $(t\cdot\theta(x))_{ab}$, where $A_\alpha \equiv t^C A^C_\alpha$. Before we can define Feynman rules
for QCD, we must choose a specific gauge to work in. In the absence of a gauge
choice, the gluon propagator would not be well defined.¹ In terms of the functional
integral,

$$\int \mathcal{D}A\, \mathcal{D}q\, \mathcal{D}\bar{q}\, \exp\left(i\int d^4x\, \mathcal{L}_{\rm QCD}\right), \qquad (2.8)$$

choosing a gauge corresponds to "factoring out" the integration over the redundant
space of gauge-equivalent field configurations. The standard procedure is to insert
a gauge-fixing factor of unity into the functional integral:

$$1 = \int \mathcal{D}\theta\, \delta\!\left(G(A^\theta)\right) \det\left(\frac{\delta G(A^\theta)}{\delta\theta}\right), \qquad (2.9)$$

¹The quadratic term for the gauge field is $\frac{1}{2}\int \frac{d^4k}{(2\pi)^4}\, A_\alpha(k)\left(-k^2 g^{\alpha\beta} + k^\alpha k^\beta\right) A_\beta(-k)$. The integrand
vanishes for a large space of gauge configurations, and hence the quadratic operator does not have
a well-defined inverse.

where $G(A^\theta)$ is some function of the (transformed) gauge field (Eq. 2.7). For linear
$G$, $\delta G(A^\theta)/\delta\theta$ is independent of $\theta(x)$, so the functional integral ($\int \mathcal{D}\theta$) factors out.
We have isolated the integration over different gauge configurations at the expense
of introducing a new term to the Lagrangian, the Faddeev-Popov determinant [35]
in Eq. 2.9. With some manipulation, it can be shown² that the $\delta$ function and the
determinant terms can be represented as a functional integral over two additional
terms in the Lagrangian:



$$\mathcal{L}_{\text{gauge-fixing}} = -\frac{1}{2\lambda}\left(\partial^\alpha A^A_\alpha\right)^2, \qquad \mathcal{L}_{\rm ghost} = \partial_\alpha \eta^{A\dagger}\left(D^\alpha_{AB}\, \eta^B\right). \qquad (2.10)$$

The gauge-fixing term modifies the gluon propagator. Any value of $\lambda$ is allowed,
different values corresponding to different gauge choices. The ghost term introduces
a pair of complex, scalar, anti-commuting "fields" $\eta^A$, $\eta^{A\dagger}$ which come with their
own functional integral. These "Faddeev-Popov ghosts" are not physical particles
(they do not even exist in certain gauges!), but they can be treated as such in
the calculation of Feynman diagrams. In practice, ghosts only appear in certain loop
diagrams, since they are never external legs.

The Feynman rules arising from the gauge-fixed QCD Lagrangian are given in
Fig. 2.1. Note the appearance of the free parameter $\lambda$ in the gluon propagator. Any
value of $\lambda$ can be used; any gauge-invariant calculation will be independent of the
choice. $\lambda = 1\,(0)$ is the Feynman-'t Hooft (Landau) gauge.
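The statement that the gluon propagator is ill defined without a gauge choice can be illustrated numerically. This is a minimal sketch, assuming the free quadratic operator $-k^2 g^{\alpha\beta} + k^\alpha k^\beta$ from the footnote and a covariant gauge-fixing term $-\frac{1}{2\lambda}(\partial\cdot A)^2$, which shifts the operator to $-k^2 g^{\alpha\beta} + (1 - 1/\lambda) k^\alpha k^\beta$. Color indices, factors of $i$, and the $i\epsilon$ prescription are dropped, and all function names are illustrative.

```python
# Metric diag(1, -1, -1, -1); its components are the same with upper or lower indices.
METRIC = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def lower(k):
    """Lower the index of a four-vector: k_a = g_{ab} k^b."""
    return [METRIC[a][a] * k[a] for a in range(4)]

def square(k):
    """Minkowski square k^2 = k^a k_a."""
    return sum(ku * kl for ku, kl in zip(k, lower(k)))

def quadratic_operator(k, lam=None):
    """Operator M^{ab} sandwiched between A_a(k) and A_b(-k).

    lam=None means no gauge fixing: M = -k^2 g + k k.
    Otherwise the gauge-fixing term changes the k k coefficient to (1 - 1/lam).
    """
    coeff = 1.0 if lam is None else 1.0 - 1.0 / lam
    k2 = square(k)
    return [[-k2 * METRIC[a][b] + coeff * k[a] * k[b] for b in range(4)] for a in range(4)]

def propagator(k, lam):
    """Candidate inverse D_{ab} = -(1/k^2) [g_{ab} - (1 - lam) k_a k_b / k^2]."""
    k2, kl = square(k), lower(k)
    return [[-(METRIC[a][b] - (1.0 - lam) * kl[a] * kl[b] / k2) / k2 for b in range(4)]
            for a in range(4)]

def matmul(a, b):
    return [[sum(a[i][j] * b[j][l] for j in range(4)) for l in range(4)] for i in range(4)]

k = [2.0, 0.3, -0.5, 1.1]  # arbitrary off-shell momentum (illustrative)

# Without gauge fixing, k_b is a zero mode: M^{ab} k_b = 0, so M cannot be inverted.
zero_mode = [sum(row[b] * lower(k)[b] for b in range(4)) for row in quadratic_operator(k)]

# With gauge fixing, M^{ab} D_{bc} is the identity for any nonzero lam.
product_feynman = matmul(quadratic_operator(k, 1.0), propagator(k, 1.0))
product_other = matmul(quadratic_operator(k, 0.3), propagator(k, 0.3))
```

For $\lambda = 1$ the second term of `propagator` vanishes and only the metric piece survives, which is why the Feynman-'t Hooft gauge is the most convenient for hand calculations.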

²See, e.g., [4], Sections 9.4 and 16.2.

Figure 2.1: Feynman rules for QCD (all momenta incoming). The "ghost" fields (dotted lines) can be treated
as anti-commuting scalars that only propagate internally.

2.1.2 Running of $\alpha_s$

The most important difference in the phenomenology of QED and QCD is in the
renormalization flow of the couplings. At lowest order in perturbation theory, we find
for both theories a result of the form [31]:

[Eq. 2.11 is illegible in the source.] (2.11)

where $\alpha = e^2/4\pi$ for QED and $\alpha = g^2/4\pi$ for QCD. The crucial difference lies in the sign
of $b_0$, which for QED is positive and for QCD negative. At small energies the QED
coupling asymptotes to a small value, $1/\alpha(\mu) \approx 1/\alpha_0 \approx 137$ ($m_e \neq 0$ cuts off the
running at $\mu < m_e$), growing logarithmically at larger energies: $1/\alpha(m_Z) \approx 128$.
For QCD, however, the coupling grows logarithmically smaller at large energies and
diverges at small energies. At the scale of $m_Z$, $\alpha_s(m_Z) \approx 0.118$ is small enough to
calculate interactions perturbatively. However, at scales $\mu \sim \Lambda_{\rm QCD} \sim 200$ MeV, the
perturbative result (Eq. 2.11) diverges. This does not mean the coupling itself is
becoming infinite, only that it is becoming large enough that perturbation theory
is breaking down. We observe that at low energies quarks are bound together in
hadronic states, and the perturbative breakdown of QCD in this regime indicates
that quarks and gluons are not the appropriate degrees of freedom at low energies.
In fact the mass scale of the lightest hadrons is about 200 MeV, confirming that this
is the relevant scale for low-energy QCD. That quarks are observed only as bound
states is known as "confinement"; that the coupling becomes small at large energies
is known as "asymptotic freedom".
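The qualitative behavior of the QCD coupling described in this subsection can be sketched with the standard one-loop running formula. The sketch makes several assumptions not taken from the text: the form $\alpha_s(\mu) = \alpha_s(m_Z) / [1 + b_0\, \alpha_s(m_Z) \ln(\mu^2/m_Z^2)]$ with $b_0 = (33 - 2 n_f)/(12\pi)$, a sign convention in which this $b_0$ is positive for QCD (so it may differ from the convention of Eq. 2.11), a fixed $n_f = 5$ with no flavor thresholds, and $m_Z = 91.19$ GeV. Only $\alpha_s(m_Z) = 0.118$ is taken from the text.

```python
import math

M_Z = 91.19          # GeV (assumed value)
ALPHA_S_MZ = 0.118   # reference coupling quoted in the text

def b0(nf):
    """One-loop coefficient in the convention where it is positive for QCD."""
    return (33.0 - 2.0 * nf) / (12.0 * math.pi)

def alpha_s(mu, nf=5):
    """One-loop running coupling at scale mu (GeV), evolved from m_Z."""
    denom = 1.0 + ALPHA_S_MZ * b0(nf) * math.log(mu ** 2 / M_Z ** 2)
    if denom <= 0.0:
        raise ValueError("scale is at or below the one-loop divergence")
    return ALPHA_S_MZ / denom

def divergence_scale(nf=5):
    """Scale (GeV) at which the one-loop expression diverges."""
    return M_Z * math.exp(-1.0 / (2.0 * b0(nf) * ALPHA_S_MZ))

for mu in (1000.0, M_Z, 10.0, 2.0):
    print(f"alpha_s({mu:7.2f} GeV) = {alpha_s(mu):.4f}")
print(f"one-loop divergence near {divergence_scale():.3f} GeV")
```

With these simplifications the divergence sits at a somewhat lower scale than the 200 MeV quoted above. The exact value depends on the loop order and on how quark thresholds are treated, so only the order of magnitude is meaningful here.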

2.1.3 Confinement vs. asymptotic freedom and factorization 

The twin phenomena of confinement and asymptotic freedom have important conse- 
quences for QCD phenomenology. Confinement implies that quarks and gluons are 
not well-defined "particles" in the sense of asymptotic states that propagate freely. 
The coupling binding quarks together in hadrons is so strong that individual quarks 
can never be removed. In particular, the binding energy between quarks is $\mathcal{O}(200\ \mathrm{MeV})$,
but the lightest quark masses are $\mathcal{O}(5\ \mathrm{MeV})$. As two quarks in a meson are
pulled apart, creating an additional $q\bar{q}$ pair from the vacuum becomes energetically
favorable, resulting in two mesons. At low energies, or equivalently large distance 
scales, hadrons — not quarks or gluons — are the relevant degrees of freedom. 

Asymptotic freedom, meanwhile, means that at high energies the quarks and
gluons in hadrons will behave like free particles. Probed at high energies, a proton will
appear to be a collection of weakly interacting quarks and gluons, or "partons". For
example, in a fast-moving proton, partons can only exchange large amounts of
longitudinal momentum: the relevant scale is the invariant mass of the exchanged gluon,
which is small if the exchanged momentum is longitudinal and large if it is transverse.
Large transverse momentum fluctuations involve a factor of $\alpha_s(p_T)$ and are therefore
suppressed. This leads to the picture of a high-momentum proton as a collection of 
partons, all moving in the same direction, each carrying some fraction of the total 
momentum. 

Similarly, a collision involving large transverse momentum exchange will "resolve" 
the parton structure of the proton; interactions involving more than one parton are 
suppressed. The cross section for the process $pp \to q'\bar{q}'$ can be related to the partonic
cross section $\sigma(q\bar{q} \to \gamma^* \to q'\bar{q}')$, which can be calculated perturbatively. Explicitly,
we can factorize a proton collision cross section into a partonic cross section convolved
with functions that give the probability to find partons with specific momenta inside
a proton:

$$\sigma\left(p(k_1)\, p'(k_2) \to X\right) = \int dx_1\, dx_2\, \hat{\sigma}\left(q(x_1 k_1)\, q'(x_2 k_2) \to X\right) f_q(x_1, \mu)\, f_{q'}(x_2, \mu).$$

The "parton distribution functions" $f_q$ depend on the parton flavor, momentum
fraction $x$, and some energy scale $\mu$ (the "factorization scale"), which is not well defined
but is generally taken to be related to some scale characteristic of the $qq' \to X$
process. The parton distribution functions characterize the low-energy, non-perturbative
interaction of partons within a proton and cannot be predicted using perturbative 
QCD. They are however universal across a broad class of processes, and can therefore 
be measured once and used as an input to other analyses. 
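The structure of the factorized cross section, a partonic cross section convolved with two parton distribution functions, can be sketched numerically. Everything in the sketch is a toy assumption: `toy_pdf` is a hypothetical density normalized to unit integral (real parton distributions are fit to data and depend on the factorization scale $\mu$, which is ignored here), and `threshold_xsec` is a made-up partonic cross section falling as $1/\hat{s}$ above a threshold.

```python
def toy_pdf(x):
    """Hypothetical parton density, normalized so its integral over 0 < x < 1 is 1."""
    return 6.0 * x * (1.0 - x)

def threshold_xsec(s_min):
    """Made-up partonic cross section: 1/s_hat above the threshold s_min, zero below."""
    return lambda s_hat: 1.0 / s_hat if s_hat >= s_min else 0.0

def hadronic_cross_section(partonic_xsec, pdf1, pdf2, s, n=200):
    """Midpoint-rule estimate of the double convolution

        int dx1 dx2  pdf1(x1) pdf2(x2) partonic_xsec(x1 * x2 * s),

    where x1 * x2 * s is the squared partonic center-of-mass energy for
    massless, collinear partons.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x1 = (i + 0.5) * h
        w1 = pdf1(x1)
        for j in range(n):
            x2 = (j + 0.5) * h
            total += w1 * pdf2(x2) * partonic_xsec(x1 * x2 * s)
    return total * h * h

S = 1.0e4  # hadronic center-of-mass energy squared, arbitrary units
low_threshold = hadronic_cross_section(threshold_xsec(100.0), toy_pdf, toy_pdf, S)
high_threshold = hadronic_cross_section(threshold_xsec(2500.0), toy_pdf, toy_pdf, S)
```

Raising the threshold restricts the integral to larger momentum fractions, so `high_threshold` comes out smaller than `low_threshold`. This is the sense in which the parton distributions control how often a proton collision reaches a given partonic energy.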

The largeness of the coupling at low energies makes it inevitable that the incoming 
and outgoing quarks will radiate energy away in the form of lower-energy gluons, 
that the gluons will themselves radiate and split into $q\bar{q}$ pairs, and that many
low-energy partons will result. At a "hadronization scale" $\mathcal{O}(\Lambda_{\rm QCD})$, these partons arrange
themselves into color singlets: hadrons like pions and protons. The basic QCD
observables at high-energy colliders are "jets" of hadrons, about which much more will
be said in Sec. 2.4.

2.2 Perturbative QCD example: $e^+e^- \to$ hadrons

We now consider an example calculation in perturbative QCD which although sim- 
ple will exhibit many of the features of QCD relevant to collider experiments. The 
simplest collider process that involves QCD in a fundamental way is the production 
of jets at electron-positron colliders. The presence of strongly-interacting particles in
the initial state at $ep$ or $pp$ ($p\bar{p}$) colliders introduces additional complications we will
consider in Sec. 2.4.

The simplest calculation in QED is the scattering cross section $e^+e^- \to \mu^+\mu^-$.
The QCD analog is the process $e^+e^- \to q\bar{q}$, the annihilation of an $e^+e^-$ pair into
a quark and an anti-quark. The tree-level Feynman diagrams for each process are
shown in Fig. 2.2. Of course, whereas muons propagate for distances comparable to
the size of a particle detector and thus can be directly detected, quarks cannot. With
a quark-antiquark pair produced initially, we know that they must radiate additional
colored partons, all of which eventually organize into hadrons at a lower energy scale.
We might worry that trying to calculate $\sigma(e^+e^- \to \text{hadrons})$ in terms of $\sigma(e^+e^- \to q\bar{q})$
is hopeless.

We are saved, however, by asymptotic freedom. For a high-energy $e^+e^-$ collision,
with $(p_{e^+} + p_{e^-})^2 = s \gg \Lambda_{\rm QCD}^2$, the "parton-level" $e^+e^- \to q\bar{q}$ process and the
subsequent radiation and hadronization factorize. This is the first example we will
see of a much more general phenomenon in QCD that relies on the running of the
strong coupling and a wide separation of energy scales. At the scale of the
parton-level process (known as the "hard scattering" due to the large energy scale involved),
$\alpha_s(s) \ll 1$ and perturbation theory is useful. Corrections to the tree-level process
involving high-energy gluons are therefore small and can be treated perturbatively.
Low-energy radiation and hadronization, while non-perturbative, occur at a lower energy
scale. [31] gives a nice picture of factorization in this case: consider the process as
a function of time. The $e^+e^-$ pair comes together and first annihilates into an
off-shell photon or Z boson. The uncertainty principle dictates that this intermediate
particle can only propagate for a time (or distance) inversely proportional to its
energy: $t \sim x \sim (\sqrt{s})^{-1}$. If $\sqrt{s}$ is much larger than the energy scale for radiation and
hadronization, then those processes occur over a much longer time scale and do not
resolve the effectively instantaneous annihilation. We can then assume that whatever
happens subsequent to the hard scattering occurs with probability 1, so that the total
cross section is simply the parton-level cross section:

$$\sigma(e^+e^- \to \text{hadrons}) = \sigma(e^+e^- \to q\bar{q}) + \text{perturbative corrections}. \qquad (2.12)$$

Figure 2.2: Feynman diagrams for (a) $e^+e^- \to \mu^+\mu^-$ and (b) $e^+e^- \to q\bar{q}$.

In the following subsections, we will explore the perturbative calculation of this cross
section as well as systematic methods of improvement. A much more detailed version
of this calculation is given in Appendix A.

2. 2. 1 Tree-level prediction 

The tree-level diagram for $e^+e^- \to q\bar{q}$ is given in Fig. 2.2. The intermediate boson can
be a photon or a $Z$, but we will only consider the case of a photon. At leading order in
the electroweak coupling, including the $Z$ amplitude only contributes an overall factor
to the total cross section. A simple calculation yields the differential cross section

$$\frac{d\sigma}{d\cos\theta} = \frac{\pi\alpha^2}{2s}\,3Q_f^2\left(1 + \cos^2\theta\right), \qquad (2.13)$$

which can be integrated to yield the total cross section 

$$\sigma_{\text{tree}} = \frac{4\pi\alpha^2}{3s}\,3Q_f^2 = \sigma_0 Q_f^2. \qquad (2.14)$$

It is common to define the ratio R of the total cross section for annihilation to 
hadrons versus muons: 

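$$R \equiv \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)} = \frac{\sum_q \sigma(e^+e^- \to q\bar{q})}{\sigma(e^+e^- \to \mu^+\mu^-)} = 3\sum_q Q_q^2. \qquad (2.15)$$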

The sum is over quark flavors; a sum over quark colors has already been performed 
to yield the factor of 3. 
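As a quick numerical illustration of the ratio $R$, the sketch below evaluates $3\sum_q Q_q^2$ for a chosen set of active flavors and the tree-level photon-exchange muon-pair cross section $4\pi\alpha^2/(3s)$. The fixed value of the fine-structure constant (no running), the unit conversion constant, and the function names are assumptions made for this example.

```python
import math
from fractions import Fraction

# Quark electric charges in units of the positron charge.
QUARK_CHARGES = {
    "d": Fraction(-1, 3), "u": Fraction(2, 3), "s": Fraction(-1, 3),
    "c": Fraction(2, 3), "b": Fraction(-1, 3), "t": Fraction(2, 3),
}
N_COLORS = 3  # the color sum supplies the overall factor of 3

ALPHA_EM = 1.0 / 137.036        # assumed fixed (not running) coupling
HBARC2_NB_GEV2 = 0.3893794e6    # (hbar c)^2 in nb * GeV^2


def r_ratio(flavors):
    """Tree-level R = 3 * sum_q Q_q^2 over the listed active flavors."""
    return N_COLORS * sum(QUARK_CHARGES[f] ** 2 for f in flavors)


def sigma_mumu_nb(sqrt_s_gev):
    """Tree-level photon-exchange sigma(e+e- -> mu+mu-) = 4 pi alpha^2 / (3 s), in nb."""
    s = sqrt_s_gev ** 2
    return 4.0 * math.pi * ALPHA_EM ** 2 / (3.0 * s) * HBARC2_NB_GEV2


for flavors in ("uds", "udsc", "udscb"):
    print(flavors, r_ratio(flavors))
print("sigma(mu mu) at 10 GeV [nb]:", sigma_mumu_nb(10.0))
```

Exact fractions are used for the charges so that the flavor thresholds show up as clean rational steps in $R$.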

2.2.2 Next-to-leading order 

At the first non-trivial order in $\alpha_s$, five additional diagrams contribute to the processes
$e^+e^- \to q\bar{q}$ and $e^+e^- \to q\bar{q}g$, shown in Fig. 2.3. If we are measuring the inclusive
cross section $\sigma(e^+e^- \to \text{hadrons})$ we must include both of these processes. If we wish
to calculate a differential cross section in the three-body phase space, only the $q\bar{q}g$
process contributes at $O(\alpha_s)$, but as we will see we must be careful to restrict ourselves
to regions of phase space where a perturbative expansion in $\alpha_s$ is well behaved.

As I discuss in more detail in Appendix A, the squared matrix elements for the 
real emission ($q\bar{q}g$) and virtual ($q\bar{q}$) diagrams are separately divergent in the infrared.
Calculationally, these divergences come from internal quark or gluon propagators 
going on shell. 

We can see the divergence explicitly if we consider the differential cross section in
the energies of the two quarks. Writing $x_i = 2k_i \cdot q/q^2$, we have:



$$\frac{d^2\sigma}{dx_1\,dx_2} = \frac{4\pi\alpha^2}{3s}\,3Q_f^2\,\frac{\alpha_s}{2\pi}\,C_F\,\frac{x_1^2 + x_2^2}{(1 - x_1)(1 - x_2)}. \qquad (2.16)$$

Figure 2.3: Feynman diagrams for (a) $e^+e^- \to q\bar{q}$ and (b) $e^+e^- \to q\bar{q}g$.

While this differential cross section is well behaved for large $x_1, x_2$, it diverges for
$x_1 \to 1$ and/or $x_2 \to 1$. Physically, these are the regions of phase space where the
gluon is either collinear with one quark or the other, or the gluon is soft.
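The growth of the real-emission rate near the edges of phase space can be seen numerically. The sketch below evaluates only the kinematic factor $(x_1^2 + x_2^2)/[(1 - x_1)(1 - x_2)]$ of Eq. 2.16, with overall couplings omitted; the function names and sample points are chosen for illustration, and massless partons are assumed so that $x_1 + x_2 + x_3 = 2$.

```python
def qqg_kernel(x1, x2):
    """Kinematic factor (x1^2 + x2^2) / ((1 - x1)(1 - x2)) of Eq. 2.16."""
    if not (0.0 <= x1 < 1.0 and 0.0 <= x2 < 1.0):
        raise ValueError("need 0 <= x1, x2 < 1")
    if x1 + x2 < 1.0:
        raise ValueError("outside three-body phase space: x1 + x2 >= 1 is required")
    return (x1 ** 2 + x2 ** 2) / ((1.0 - x1) * (1.0 - x2))


def gluon_energy_fraction(x1, x2):
    """x3 = 2 - x1 - x2, from energy conservation for massless partons."""
    return 2.0 - x1 - x2


# Symmetric point, then approaching the collinear edge (x1 -> 1) and
# finally the soft corner (x1, x2 -> 1, so x3 -> 0).
for x1, x2 in [(2 / 3, 2 / 3), (0.99, 0.6), (0.999, 0.6), (0.999, 0.999)]:
    print(x1, x2, gluon_energy_fraction(x1, x2), qqg_kernel(x1, x2))
```

Approaching a single edge gives one inverse power of the small distance, while the soft corner gives two, which is the double singularity referred to later in the discussion of logarithms.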

With a suitable infrared regulator, we find that the sum of the real and virtual 
diagrams is finite. The divergences only arise because of our insistence on describing 
the event in terms of quarks and gluons, which are not sensible degrees of freedom 
over all of phase space. Performing the calculation requires a choice of regulator; in 
this thesis I use dimensional regularization (see, e.g., [4]). In $d = 4 - 2\epsilon$ dimensions,
the real and virtual contributions to the total cross section are given by Eqs. A.59
and A.52:

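$$\sigma_{\text{real}} = \sigma_0 \sum_f Q_f^2\,\frac{\alpha_s C_F}{2\pi}\,R(\epsilon)\left[\frac{2}{\epsilon^2} + \frac{3}{\epsilon} + \frac{19}{2} - \pi^2 + O(\epsilon)\right],$$
$$\sigma_{\text{virt}} = \sigma_0 \sum_f Q_f^2\,\frac{\alpha_s C_F}{2\pi}\,R(\epsilon)\left[-\frac{2}{\epsilon^2} - \frac{3}{\epsilon} - 8 + \pi^2 + O(\epsilon)\right]. \qquad (2.17)$$

I have performed the sum over colors and left a sum over flavors. $R(\epsilon)$ is defined in
Eq. A.50 and is equal to $1 + O(\epsilon)$. Adding these to the tree-level contribution yields
the finite final answer (Eq. A.60)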



$$\sigma(e^+e^- \to \text{hadrons}) = \sigma_0 \sum_f Q_f^2\left(1 + \frac{\alpha_s}{\pi}\right). \qquad (2.18)$$
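The cancellation between real and virtual contributions can be checked with a few lines of arithmetic. This sketch assumes the standard dimensional-regularization brackets, $2/\epsilon^2 + 3/\epsilon + 19/2 - \pi^2$ for real emission and $-2/\epsilon^2 - 3/\epsilon - 8 + \pi^2$ for the virtual correction, both multiplied by a common factor $\alpha_s C_F/(2\pi)$; the virtual finite part is quoted from the standard result rather than from this section, and the variable names are illustrative.

```python
import math

C_F = 4.0 / 3.0  # quark color factor in QCD

# Laurent coefficients in epsilon of the bracketed terms, keyed by the
# power of epsilon (assumed standard values, see the lead-in).
real = {-2: 2.0, -1: 3.0, 0: 19.0 / 2.0 - math.pi ** 2}
virt = {-2: -2.0, -1: -3.0, 0: -8.0 + math.pi ** 2}

# Add the two contributions order by order in epsilon.
total = {power: real[power] + virt[power] for power in real}


def nlo_coefficient():
    """Coefficient of alpha_s in the ratio to tree level: (C_F / 2 pi) * finite part."""
    return C_F / (2.0 * math.pi) * total[0]


print("1/eps^2 pole:", total[-2])
print("1/eps pole:  ", total[-1])
print("finite part: ", total[0])
print("coefficient of alpha_s:", nlo_coefficient(), "vs 1/pi =", 1.0 / math.pi)
```

Both poles cancel, and the finite remainder $3/2$ combines with $C_F/(2\pi)$ to give the familiar correction $\alpha_s/\pi$.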



We can now see the contrast between two types of QCD calculations. Some 
perturbative calculations will yield finite answers; some will not. The distinction 
will be whether the calculation adds together contributions that contribute to the 
same observable phase space, sometimes described as whether the calculation is
"suitably inclusive". The cross section for $e^+e^- \to q\bar{q}$ is not finite beyond tree level
due to an infinite virtual correction. The total cross section for $e^+e^- \to \text{hadrons}$,
on the other hand, is finite because there are canceling divergences in the two- and
three-parton cross sections. Of course, the total cross section is not the only quantity
we can calculate that is well-defined. Provided we group the singular pieces of the
real contribution with the virtual part, the resulting "two-body" and "three-body"
calculations will be separately finite (e.g., the differential cross section Eq. 2.16 for
large $x_1, x_2$). This leads to the idea of jets, which we will discuss further in Sec. 2.4.



2.2.3 Logarithmic resummation 

Calculations in perturbative QCD that involve multiple scales will typically depend 
on logarithms of ratios of those scales. We will see an explicit example of this in the 
calculation in Chapter 3. If the lower scale is regulating an infrared divergence in the
differential cross section, up to two powers of the logarithm will appear at every order
in perturbation theory, corresponding to the double singularity for gluon emission
seen in Eq. 2.16:



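$$\int dX\,\frac{d\sigma}{dX} \sim 1 + \alpha_s\left(L^2 + L + \ldots\right) + \alpha_s^2\left(L^4 + L^3 + \ldots\right) + \ldots \qquad (2.19)$$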

Often the logarithmic dependence will exponentiate, meaning that the cross section
can be written:

$$\int dX\,\frac{d\sigma}{dX} \sim \exp\left[\alpha_s\left(L^2 + L + \ldots\right) + \alpha_s^2\left(L^3 + L^2 + \ldots\right) + \ldots\right] \qquad (2.20)$$

All of the $\alpha_s^n L^{2n}$ terms in the expansion are captured by the $\alpha_s L^2$ term in the exponent,
which only contains logarithms up to $\alpha_s^n L^{n+1}$. In the terminology of Chapter 3, terms
of order $\alpha_s^n L^{n+1}$ are "leading logarithmic (LL)", terms of order $\alpha_s^n L^n$ are
"next-to-leading logarithmic (NLL)", etc. In general, perturbation theory including logarithmic
resummation converges better, particularly in the regions of phase space
where the logarithms are large.
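A toy model makes the benefit of resummation visible. The sketch below assumes a pure double-logarithmic form $\exp(-c\,\alpha_s L^2)$ with a hypothetical coefficient $c = 1$; it is not the result of any calculation in this thesis, only an illustration of how a truncated expansion behaves when $\alpha_s L^2$ is not small.

```python
import math


def resummed(alpha_s, log_l, c=1.0):
    """Toy all-orders form exp(-c * alpha_s * L^2); c is a hypothetical coefficient."""
    return math.exp(-c * alpha_s * log_l ** 2)


def fixed_order(alpha_s, log_l, order, c=1.0):
    """The same toy function expanded in alpha_s and truncated at the given order."""
    x = -c * alpha_s * log_l ** 2
    return sum(x ** n / math.factorial(n) for n in range(order + 1))


alpha_s = 0.1
for log_l in (1.0, 3.0, 5.0):
    truncated = [fixed_order(alpha_s, log_l, n) for n in (1, 2, 3)]
    print(f"L = {log_l}: resummed = {resummed(alpha_s, log_l):.4f}, "
          f"orders 1-3 = {[round(t, 4) for t in truncated]}")
```

For small logarithms the truncated series tracks the exponential closely, while for large logarithms the low-order truncations oscillate and can even turn negative, although the exponentiated form stays between 0 and 1.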

2.3 Effective theories of QCD 

In the previous section we saw hints that seemingly straightforward calculations in 
perturbative QCD can be difficult to perform and subject to large corrections due to 
logarithms of ratios of scales. These issues can at least partly be addressed by using 
effective theories of QCD. Effective theories, in the "top-down" approach where we 
know the full theory already, are simply field theories in which some modes have been 
integrated out, leaving a different set of operators in an effective Lagrangian for the 
remaining modes. The classic example is the Fermi theory of the weak interaction 
where the $W$ boson is integrated out, leaving (non-renormalizable) four-fermion
interactions. In general, an effective theory removes particles above some mass or energy
scale in order to simplify the description of physics below that scale. By construction
it must reproduce the low-energy physics of the full theory, up to corrections of $O(p/\Lambda)$,
where $p$ is some relevant scale for the problem and $\Lambda$ is the "cutoff" scale that delineates
what has been integrated out. In the case of Fermi theory, $\Lambda \sim m_W$.

In this thesis I will consider a particular effective theory of QCD, soft-collinear
effective theory (SCET) [11, 12, 13, 14], an effective theory relevant to radiation
from high-energy quarks and gluons. High-energy, large-angle (perturbative)
emissions are integrated out, leaving only low-energy (soft) and small-angle (collinear)
degrees of freedom. In Chapter 3 we will see that this formulation, after suitable field
redefinitions, decouples the soft and collinear modes from each other. This allows
jet-based cross sections to be factorized into several pieces, each of which depends on
a single momentum scale and hence contains no large logarithms.

The remainder of this section will be a review of SCET. At the end of the next
section (Sec. 2.4.5), I will briefly review the class of observables considered in
Chapter 3.

2.3.1 Review of SCET 

SCET is the effective field theory for QCD with all degrees of freedom integrated out,
other than those traveling with large energy but small virtuality along a light-like
trajectory $n$, and those with small momenta in all components.^ A particularly useful
set of coordinates is light-cone coordinates, which uses light-like directions $n$ and $\bar{n}$,
with $n^2 = \bar{n}^2 = 0$ and $n \cdot \bar{n} = 2$. In Minkowski coordinates, we take $n = (1, 0, 0, 1)$
and $\bar{n} = (1, 0, 0, -1)$, corresponding to collinear particles moving in the $+z$ direction.
A generic four-vector $p^\mu$ can be decomposed into components

$$p^\mu = \bar{n} \cdot p\,\frac{n^\mu}{2} + n \cdot p\,\frac{\bar{n}^\mu}{2} + p_\perp^\mu.$$



^This subsection is taken, with small edits, from Sec. 4.1 of [25].

In terms of these components, $p = (\bar{n} \cdot p, n \cdot p, p_\perp)$, collinear and soft momenta scale
with some small parameter $\lambda$ as

$$p_n \sim E(1, \lambda^2, \lambda), \qquad p_s \sim E(\lambda^2, \lambda^2, \lambda^2), \qquad (2.21)$$

where $E$ is a large energy scale, for example, the center-of-mass energy in an $e^+e^-$
collision. $\lambda$ is then the ratio of the typical transverse momentum of the constituents
of the jet to the total jet energy. Quark and gluon fields in QCD are divided into
collinear and soft effective theory fields with these respective momentum scalings:

$$q(x) = q_n(x) + q_s(x), \qquad A^\mu(x) = A_n^\mu(x) + A_s^\mu(x). \qquad (2.22)$$
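The collinear scaling of Eq. (2.21) can be checked on an explicit momentum. The sketch below is a minimal illustration, assuming the Minkowski-coordinate choice $n = (1, 0, 0, 1)$, $\bar{n} = (1, 0, 0, -1)$ from the text and a metric with signature $(+, -, -, -)$; the function names and the sample momentum are hypothetical.

```python
import math

N = (1.0, 0.0, 0.0, 1.0)      # light-like direction n
NBAR = (1.0, 0.0, 0.0, -1.0)  # conjugate light-like direction n-bar


def mdot(a, b):
    """Minkowski product with signature (+, -, -, -) for vectors (E, px, py, pz)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def light_cone(p):
    """Return (nbar.p, n.p, |p_perp|) for a four-vector p."""
    return mdot(NBAR, p), mdot(N, p), math.hypot(p[1], p[2])


def scaling_parameter(p):
    """Estimate lambda as |p_perp| / (nbar.p) for a collinear momentum."""
    nbar_p, _, perp = light_cone(p)
    return perp / nbar_p


# A massless particle with E = 100 GeV at a small angle (0.1 rad) from +z.
theta = 0.1
p = (100.0, 100.0 * math.sin(theta), 0.0, 100.0 * math.cos(theta))
nbar_p, n_p, perp = light_cone(p)
lam = scaling_parameter(p)
print("components:", nbar_p, n_p, perp)
print("lambda:", lam, " n.p / nbar.p:", n_p / nbar_p, " lambda^2:", lam ** 2)
```

For a massless momentum $p^2 = (\bar{n} \cdot p)(n \cdot p) - p_\perp^2 = 0$, so the small component is suppressed by exactly $\lambda^2$ relative to the large one, matching the pattern $(1, \lambda^2, \lambda)$.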

We factor out a phase containing the largest components of the collinear momentum
from the fields $q_n, A_n$. Defining the "label" momentum $\tilde{p}_n^\mu = \bar{n} \cdot \tilde{p}_n\,n^\mu/2 + \tilde{p}_\perp^\mu$, where $\bar{n} \cdot \tilde{p}_n$
contains the $O(1)$ part of the large light-cone component of the collinear momentum
$p_n$, and $\tilde{p}_\perp$ the $O(\lambda)$ transverse component, we can partition the collinear fields $q_n, A_n$
into their labeled components,

$$q_n(x) = \sum_{\tilde{p} \neq 0} e^{-i\tilde{p} \cdot x}\,q_{n,p}(x), \qquad A_n^\mu(x) = \sum_{\tilde{p} \neq 0} e^{-i\tilde{p} \cdot x}\,A_{n,p}^\mu(x). \qquad (2.23)$$

The sums are over a discrete set of $O(1, \lambda)$ label momenta into which momentum
space is partitioned. The bin $\tilde{p} = 0$ is omitted to avoid double-counting the soft mode
in Eq. (2.22) [36]. The labeled fields $q_{n,p}, A_{n,p}$ now have spacetime fluctuations in
$x$ which are conjugate to "residual" momenta $k$ of order $E\lambda^2$, describing remaining
fluctuations within each labeled momentum partition [13, 36]. It will be convenient to
define label operators $\mathcal{P}^\mu = \bar{n} \cdot \mathcal{P}\,n^\mu/2 + \mathcal{P}_\perp^\mu$ which pick out just the label components
of momentum of a collinear field:

$$\mathcal{P}^\mu \phi_{n,p}(x) = \tilde{p}^\mu \phi_{n,p}(x). \qquad (2.24)$$

Ordinary derivatives acting on effective theory fields $\phi_{n,p}(x)$ are of order $E\lambda^2$.

The final step to construct the effective theory fields is to isolate the two large
components of the Dirac spinor $q_{n,p}$ for a fermion with lightlike momentum along $n$.
The large components $\xi_{n,p}$ and the small $\Xi_{n,p}$ can be separated by the projections

$$\xi_{n,p} = \frac{\slashed{n}\slashed{\bar{n}}}{4}\,q_{n,p}, \qquad \Xi_{n,p} = \frac{\slashed{\bar{n}}\slashed{n}}{4}\,q_{n,p}, \qquad (2.25)$$

and we have $q_{n,p} = \xi_{n,p} + \Xi_{n,p}$. One can show, substituting these definitions into the
QCD Lagrangian, that the fields $\Xi_{n,p}$ have an effective mass of order $E$ and can be
integrated out of the theory. The effective theory Lagrangian at leading order in $\lambda$ is
[12, 13, 14]

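$$\mathcal{L}_{\text{SCET}} = \mathcal{L}_\xi + \mathcal{L}_{A_n} + \mathcal{L}_s, \qquad (2.26)$$

where the collinear quark Lagrangian $\mathcal{L}_\xi$ is

$$\mathcal{L}_\xi = \bar{\xi}_n(x)\left[i n \cdot D + i\slashed{D}_\perp^c\,W_n(x)\,\frac{1}{i\bar{n} \cdot \mathcal{P}}\,W_n^\dagger(x)\,i\slashed{D}_\perp^c\right]\frac{\slashed{\bar{n}}}{2}\,\xi_n(x), \qquad (2.27)$$

where $W_n$ is the Wilson line of collinear gluons,

$$W_n(x) = \sum_{\text{perms}} \exp\left[-g\,\frac{1}{\bar{n} \cdot \mathcal{P}}\,\bar{n} \cdot A_n(x)\right], \qquad (2.28)$$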



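the collinear gluon Lagrangian $\mathcal{L}_{A_n}$ is

$$\mathcal{L}_{A_n} = \frac{1}{2g^2}\,\text{Tr}\left\{\left[i\mathcal{D}^\mu + gA_n^\mu,\ i\mathcal{D}^\nu + gA_n^\nu\right]\right\}^2 + 2\,\text{Tr}\left\{\bar{c}_n\left[i\mathcal{D}_\mu, \left[i\mathcal{D}^\mu + gA_n^\mu, c_n\right]\right]\right\} + \frac{1}{\alpha}\,\text{Tr}\left\{\left[i\mathcal{D}_\mu, A_n^\mu\right]\right\}^2, \qquad (2.29)$$

where $c_n$ is the collinear ghost field and $\alpha$ the gauge-fixing parameter; and the soft
Lagrangian $\mathcal{L}_s$ is

$$\mathcal{L}_s = \bar{q}_s(x)\,i\slashed{D}_s\,q_s(x) - \frac{1}{2}\,\text{Tr}\,G_s^{\mu\nu}G_{s\,\mu\nu}(x), \qquad (2.30)$$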

which is identical to the form of the full QCD Lagrangian (the usual gauge-fixing
terms are implicit). In the collinear Lagrangians, we have defined several covariant
derivative operators,

$$D^\mu = \partial^\mu - igA_n^\mu - igA_s^\mu, \qquad iD_c^\mu = \mathcal{P}^\mu + gA_n^\mu, \qquad i\mathcal{D}^\mu = \mathcal{P}^\mu + i n \cdot D\,\frac{\bar{n}^\mu}{2}. \qquad (2.31)$$

In addition, there is an implicit sum over the label momenta of each collinear field
and the requirement that the total label momentum of each term in the Lagrangian
be zero.

Note the soft quarks do not couple to collinear particles at leading order in $\lambda$.
Meanwhile, the coupling of the soft gluon field to a collinear field is in the component
$n \cdot A_s$ only, according to Eqs. (2.27) and (2.29), which makes possible the decoupling
of such interactions through a field redefinition of the soft gluon field given in [14].
We will utilize this soft-collinear decoupling to simplify the proof of factorization in
Chapter 3.

The SCET Lagrangian Eq. (2.26) may be extended to include collinear particles
in more than one direction [37]. One adds multiple copies of the collinear quark and
gluon Lagrangians Eqs. (2.27) and (2.29) together. The collinear fields in each
direction $n_i$ constitute their own independent set of quark and gluon fields, and are
governed in principle by different expansion parameters $\lambda$ associated with the
transverse momentum of each jet, set either by the angular cut $R$ in the jet algorithm or
by the measured value of the jet shape $\tau_a$. Each collinear sector may be paired with
its own associated soft field $A_s$ with momentum of order $E\lambda^2$ with the appropriate
$\lambda$. For the purposes of keeping the notation tractable while proving the factorization
theorem in this section, we will for simplicity take all $\lambda$'s to be the same, with a single
soft gluon field $A_s$ coupling to collinear modes in all sectors. In [25] we discuss how
to "refactorize" the soft function further into separate soft functions each depending
only on one of the various possible soft scales.

The effective theory containing $N$ collinear sectors and the soft sector is appropriate
to describe QCD processes with strongly-interacting particles collimated in $N$
well-separated directions. Thus, in addition to the power counting in the small parameter
$\lambda$ within each sector, guaranteeing that the particles in each direction are
well collimated, we will find in calculating an $N$-jet cross section the need for another
parameter that guarantees that the different directions $n_i$ are well separated. This
latter condition requires $t_{ij} \gg 1$, where $t_{ij}$ is defined for jets $i$ and $j$ in Eq. (3.1).^
2.4 Jet physics and collider phenomenology 

In Sec. 2.2 we saw that perturbative QCD predictions were finite when we combined
cross sections in such a way that we included all processes leading to the same
observable final state. This observation is the basis of jet physics. Whereas the "two-parton"
and "three-parton" cross sections were both infinite at NLO, the "two-jet" cross
section, where we combine the two-parton cross section with the soft/collinear parts
of the three-parton cross section, was finite. Likewise, the "three-jet" cross section,
where we restrict the three partons to be well separated by some metric, will also
be finite. To make this more precise, we need a "jet algorithm", which I will discuss
more carefully in Sec. 2.4.4. In terms of a perturbative calculation, the role of a "jet
algorithm" is to combine different final states such that the appropriate real and
virtual diagrams have canceling singularities. An algorithm that does this in all cases
is said to be "infrared safe"; one that does not, at least for some configurations, is
said to be "infrared unsafe", or perhaps more accurately "infrared sensitive". With
this goal in mind, we now review some of the basic collider physics relevant to the
production and reconstruction of jets.

2.4.1 The parton shower and hadronization

We can see the need for something like jets by considering two- and three-parton final 
states in $e^+e^-$ collisions, but the same effects are present at every order in perturbation
theory. A final state with n partons will have, at tree level, real singularities that 
must cancel against virtual singularities in all m-parton processes for m < n. A 

^This condition is a consequence of our insistence on using operators with exactly $N$ directions
to create the final state. We could move away from the large-$t$ limit and account for corrections
to it by using a basis of operators with arbitrary numbers of jets and properly accounting for the
regions of overlap between an $N$-jet operator and $(N \pm 1)$-jet operators. This is outside the scope
of the present work, where we limit ourselves to kinematics well described by an $N$-jet operator,
and thus limit ourselves to the large-$t$ limit.

jet algorithm combines these canceling singularities by rearranging $n$-parton phase
space into $N$-jet phase space where cross sections are individually finite. The "parton
shower" is the process by which a high-energy quark or gluon radiates many more
gluons, which themselves can radiate and (for gluons) split into $q\bar{q}$ pairs. The radiation
is dominated by the soft/collinear singularities in the gluon emission cross section seen
in Sec. 2.2. The jet algorithm can be thought of as trying to reverse this process.

An additional complication arises once the $n$-parton final state hadronizes. Whereas
a partonic final state can in principle be grouped such that there is a one-to-one
mapping of jets to initial partons (ignoring interference), this is no longer possible after
hadronization. A jet algorithm acting on hadrons must produce groups of hadrons,
necessarily color singlets, which can never be mapped unambiguously to colored initial
partons.^ This means that the standard language of equating jets with initial partons
is always subject to corrections, expected to be $O(\Lambda_{\text{QCD}}/Q)$, where $Q$ is some relevant
hard scale. An important consideration in the evaluation of a jet algorithm is the size
of hadronization corrections (see, e.g., the discussion in [38]).

2.4.2 Observing jets

Every event at an $e^+e^-$ collider that produces strongly interacting particles, and
every event at a hadron collider, involves jets in a fundamental way. The ability to
measure and understand jets is therefore central to collider physics. Modern detector 
experiments observe jets primarily as energy depositions in a calorimeter: the set of 
energetic hadrons produced in the collision is seen as a two- or three-dimensional dis- 
tribution of energy. Information from a tracking system, where the paths of individual 
particles can be observed, is also increasingly being used in the study of jets. 

^An exception to this rule is the ARCLUS dipole clustering algorithm [38], which proceeds via
$3 \to 2$ recombinations and does not assign hadrons to specific jets. Of course, this does not solve
the problem of ambiguity so much as accept it as unavoidable.

At the LHC, the principal detectors are ATLAS [39] and CMS [40]. As far as
jet measurements are concerned, they share a few essential features. Both detectors
are roughly cylindrical and surround the point of interaction, providing full coverage
out to $|\eta| = |-\ln\tan(\theta/2)| \approx 5$.^ The innermost layers are tracking layers,
which pinpoint locations where charged particles pass. With multiple tracking layers,
the paths of individual particles can be reconstructed with high precision. The
entire system is placed within a magnetic field, so measuring the curvature of a
particle's path determines its momentum. Beyond the tracking system are two levels
of calorimetry: the electromagnetic and hadronic calorimeters. Calorimeters absorb
and measure the energy of particles entering them. The electromagnetic calorimeter
is thick enough to absorb essentially all of the energy contained in electron or photon
showers, but high-energy hadrons like nucleons and pions will only deposit some of
their energy in this layer and must be stopped by the hadronic calorimeter. The
hadronic calorimeters are larger and less finely segmented than the electromagnetic
calorimeters. The segmentation for both ATLAS and CMS hadronic calorimeters is
approximately $\Delta\eta \times \Delta\phi = 0.1 \times 0.1$.

A typical event at the LHC will have many calorimeter cells with significant ($p_T \gtrsim 1$ GeV)
energy deposition, which must be organized into jets for analysis. One possible
input to a jet algorithm is simply the set of calorimeter cells, each having some
measured energy and associated with a direction. Assuming that an individual cell
corresponds either to a single particle or multiple essentially collinear particles, we
can assign it a four-momentum by assuming that the corresponding mass is zero. We
can imagine an "ideal calorimeter" that only combined nearby particles in this way
(but did not have any uncertainty on the total four-momentum). A reasonable jet
algorithm should at minimum be insensitive to this kind of initial merging of nearby
particles.
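The massless-cell assignment described above can be sketched in a few lines. This is an illustrative construction only, assuming each cell is characterized by a measured energy and a direction given as pseudorapidity $\eta$ and azimuth $\phi$; the function names are hypothetical and no detector effects are modeled.

```python
import math


def pseudorapidity(theta):
    """eta = -ln tan(theta / 2) for a polar angle theta measured from the beam."""
    return -math.log(math.tan(theta / 2.0))


def cell_four_momentum(energy, eta, phi):
    """Massless four-momentum (E, px, py, pz) for a cell with direction (eta, phi)."""
    pt = energy / math.cosh(eta)  # for a massless object E = pT * cosh(eta)
    return (energy, pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta))


# Example: a 50 GeV deposit in a cell centered at eta = 1.2, phi = 0.7.
p = cell_four_momentum(50.0, 1.2, 0.7)
mass_squared = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
print("four-momentum:", p)
print("mass squared (zero up to rounding):", mass_squared)
```

Summing such massless cell momenta gives a jet a nonzero invariant mass whenever the cells point in different directions, which is why jet mass is a meaningful observable even with massless inputs.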

Two interesting possibilities exist to supplement the information from the hadronic 

^See the glossary item Hadron Collider, Variables for definitions of the various kinematic
variables used at hadron colliders, and the reasons for their use.

calorimeter in defining the inputs to a jet algorithm. First, particles in a jet also
deposit energy in the electromagnetic calorimeter, which has higher spatial resolution. 
Using information from the electromagnetic calorimeter could allow the resolution of 
smaller-scale features in jet physics. This could be particularly useful in the case of 
jets from heavy particle decays at very large transverse momentum, where the decay 
products become boosted very close together. 

A second possibility is the use of tracking information in describing jets. In
principle, tracks can identify single particles and measure their momentum more precisely
than the calorimeters measure their energy. CMS, for example, uses a "jets-plus-tracks"
algorithm [41] that improves jet energy resolution by using the tracking system
to measure the momentum of charged particles in the jet (including particles that are
bent out of the jet cone by the magnetic field). CMS also uses a "particle flow"
algorithm [42] that attempts to distinguish electrons, photons, charged hadrons, neutral
hadrons, and muons based on their activity in multiple detector layers; identified
particles can then be individually calibrated. Both methods significantly improve the
final jet energy resolution [43].

2.4.3 The event environment at the LHC

An event at the LHC is considerably more complicated than the simple $e^+e^- \to q\bar{q}$
events imagined in Sec. 2.2. Most of the complications arise from the simple difference
that the LHC will collide protons, which are composite objects. Rather than collide
quarks or gluons (which would be ideal), the LHC will collide bags of them: protons.
The asymptotic freedom of QCD means that high-energy proton interactions can be
viewed as perturbative interactions between relatively free quarks and gluons, with
the remainder of the protons acting as spectators. Unfortunately, asymptotic freedom
also means that partonic (quark-on-quark, say) collisions involving large transverse
momentum transfer (and hence involving $\alpha_s(\mu)$ evaluated at a large scale) are
rare relative to the overall inelastic (proton-breaking) cross section.

In a hadron collider, the strongly-interacting incoming partons can radiate prior
to the hard interaction (initial state radiation, ISR). This adds to the radiation from 
any outgoing colored partons (final state radiation, FSR). Moreover, the remnants 
of the proton can also interact with each other (multiple parton interactions, MPI; 
also known as the underlying event (UE)). In principle this must happen to some 
degree because the beam remnants are not color singlets and must interact at least 
enough to hadronize. Likewise, initial state radiation and the underlying event are 
not in general independent from the hard scattering final state due to color
connections. If the final state is colored (a $gg \to t\bar{t}$ event, say), there is not even a unique
assignment of outgoing hadrons to FSR, ISR or UE. Moreover, quantum mechanics
allows interference between these processes. Of the three, the underlying event is the
most difficult to model and measure; for an extensive selection of recent work on this
subject see [44].



One final contribution adds to hadronic activity in an LHC event: pile-up (PU). 
The LHC is designed to collide bunches of many protons at once to increase the 
likelihood of a high-transverse-momentum interaction. In the background to these 
events, however, are much lower-energy collisions between other proton pairs. At full 
design luminosity the LHC will observe approximately 25 collisions at once! While 
pile-up, unlike ISR and UE, is truly independent of the final state physics, at large 
luminosities it grows in importance. 



All of these effects of the hadronic environment make it more difficult to predict 
and observe phenomena at the LHC. We will see in Chapter 5 that techniques that
reduce these effects can significantly improve the performance of LHC searches.

2.4.4 Jets and jet algorithms

To make sense of the multiplicity of hadrons produced in collisions with final-state
quarks or gluons, we group them into jets (for two good reviews, see [23] and [45]).^
High-energy quarks and gluons radiate many more gluons and $q\bar{q}$ pairs, but the
radiation is dominantly soft and/or collinear. This means that most of the energy
of the initial parton will be located in a small angular area in the detector, plus
low-energy deposits at larger angle. Large-energy, large-angle radiation can only
come from perturbative emission, and therefore tends to happen with probability
$\sim \alpha_s(p_{T_J}) \sim 0.1$. An ATLAS event with two jets is shown in Fig. 2.4.




Figure 2.4: Several event views for an event at ATLAS. Two high-energy "jets" have 
been identified, along with several much lower-energy jets clustered around them 
(colored circles in right plot). Taken from the ATLAS public website [39].



^This subsection, with small modifications, is taken from Sec. II of [2].






Recombination algorithms 

To identify jets we need a jet algorithm. Jet algorithms can be broadly divided into two 
categories, recombination algorithms and cone algorithms. Both types of algorithms 
form jets from protojets, which are initially generic objects such as calorimeter towers, 
topological clusters^, or final state particles. Cone algorithms fit protojets within a 
fixed geometric shape, the cone, and attempt to find stable configurations of those 
shapes to find jets. In the cone-jet language, "stable" means that the direction of 
the total four-momentum of the protojets in the cone matches the direction of the 
axis of the cone. Recombination algorithms, on the other hand, give a prescription 
to pairwise (re)combine protojets into new protojets, eventually yielding a jet. For 
the recombination algorithms studied in this work, this prescription is based on an 
understanding of how the QCD shower operates, so that the recombination algorithm 
attempts to undo the effects of showering and approximately trace back to objects 
coming from the hard scattering. The anti-$k_T$ algorithm [46] functions more like the
original cone algorithms, and its recombination scheme is not designed to backtrack
through the QCD shower. Cone algorithms have been the standard in collider experiments,
but recombination algorithms are finding more frequent use. Analyses at the
Tevatron [47] have shown that the most common cone and recombination algorithms
agree in measurements of jet cross sections. In this work we are most interested in jet 
substructure, and we therefore consider only recombination algorithms, which define 
substructure in a natural way. 

A general recombination algorithm uses a distance measure $\rho_{ij}$ between protojets
to control how they are merged. A "beam distance" $\rho_i$ determines when a protojet
should be promoted to a jet. The algorithm proceeds as follows:

0. Form a list $L$ of all protojets to be merged.

1. Calculate the distance between all pairs of protojets in $L$ using the metric $\rho_{ij}$,
and the beam distance for each protojet in $L$ using $\rho_i$.

2. Find the smallest overall distance in the set $\{\rho_i, \rho_{ij}\}$.

3. If this smallest distance is a $\rho_{ij}$, merge protojets $i$ and $j$ by adding their four
vectors. Replace the pair of protojets in $L$ with this new merged protojet. If
the smallest distance is a $\rho_i$, promote protojet $i$ to a jet and remove it from $L$.

4. Iterate this process until $L$ is empty, i.e., all protojets have been promoted to
jets.^

^In addition to single cells, ATLAS also uses three-dimensional "topological clusters" of calorimeter
cells as inputs to jet analyses. Topological clustering is a method of combining nearby cells
into an object significant and well enough measured to be locally calibrated.

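For the $k_T$ [48, 49, 50], Cambridge-Aachen (CA) [51], and anti-$k_T$ [46] recombination
algorithms the metrics are

$$\begin{aligned}
k_T &: \quad \rho_{ij} = \min(p_{Ti}, p_{Tj})\,\Delta R_{ij}/D, & \rho_i &= p_{Ti}; \\
\text{CA} &: \quad \rho_{ij} = \Delta R_{ij}/D, & \rho_i &= 1; \\
\text{anti-}k_T &: \quad \rho_{ij} = \min(p_{Ti}^{-1}, p_{Tj}^{-1})\,\Delta R_{ij}/D, & \rho_i &= p_{Ti}^{-1}.
\end{aligned} \qquad (2.32)$$

Note that all three are specific instances of the general metric with parameter $\alpha$:

$$\text{generic } k_T: \quad \rho_{ij} = \min(p_{Ti}^\alpha, p_{Tj}^\alpha)\,\Delta R_{ij}/D, \qquad \rho_i = p_{Ti}^\alpha. \qquad (2.33)$$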

Here $p_{Ti}$ is the transverse momentum of protojet $i$ and $\Delta R_{ij} = \sqrt{(\phi_i - \phi_j)^2 + (y_i - y_j)^2}$
is a measure of the angle between two protojets that is invariant under boosts along
and rotations around the beam direction. $\phi$ is the azimuthal angle around the beam
direction, $\phi = \tan^{-1} p_y/p_x$, and $y$ is the rapidity, $y = \tanh^{-1} p_z/E$, with the beam
along the $z$ axis. The angular parameter $D$ governs when protojets should be promoted
to jets: it determines when a protojet's beam distance is less than the distance
to other objects. $D$ provides a rough measure of the typical angular size (in $y$-$\phi$) of
the resulting jets.

^This defines an inclusive algorithm. For an exclusive algorithm, there are no promotions, but
instead of recombining until $L$ is empty, mergings proceed until all $\rho_{ij}$ exceed a fixed $\rho_{\text{cut}}$.

The recombination metric determines the order in which protojets are merged
in the jet, with recombinations that minimize the metric performed first. From the
definitions of the recombination metrics in Eq. (2.32), it is clear that the $k_T$ algorithm
tends to merge low-$p_T$ protojets earlier, while the CA algorithm merges pairs in strict
angular order. This distinction will be very important in our subsequent discussion.
Anti-$k_T$, meanwhile, tends to cluster protojets around the hardest protojet, producing
cone-like jets with less interesting substructure.

These definitions are all appropriate for finding jets at a hadron collider, where
invariance under longitudinal boosts is desired. At an $e^+e^-$ collider, $p_T$ is replaced
by $E$, and $\Delta R_{ij}$ is typically replaced by $(1 - \cos\theta_{ij})$. Moreover, the beam metric $\rho_i$ is
not used; instead, merging proceeds until all $\rho_{ij}$ exceed some (usually dimensionful)
value $y_{\text{cut}}$ which depends on the center-of-mass energy $Q^2$.

Jet Substructure 

A recombination algorithm naturally defines substructure for the jet. The sequence
of recombinations tells us how to construct the jet in step-by-step $2 \to 1$ mergings,
and we can unfold the jet into two, three, or more subjets by undoing the last
recombinations. The jet algorithm begins and ends with physically meaningful information
(starting at calorimeter cells, for example, and ending at jets), so we might expect
the intermediate (subjet) information to have physical significance as well. In
particular, we expect the earliest recombinations to approximately reconstruct the
QCD shower, while the last recombinations in the algorithm, those involving the
largest-$p_T$ degrees of freedom, may indicate whether the jet was produced by QCD
alone or a heavy particle decay plus QCD showering. This will be true for the CA
and $k_T$ algorithms, where the metric reflects the soft ($k_T$) and collinear (CA and $k_T$)
dynamics of the parton shower. To discuss the details of jet substructure, we begin
by defining relevant variables.

Variables Describing Branchings and Their Kinematics 

Whereas the jet algorithm can be thought of as a sequence of mergings, the parton 
shower, possibly preceded by a decay, can be thought of as a sequence of branchings. 
In studying the substructure produced by jet algorithms, it will be useful to describe 
branchings using a set of kinematic variables. Since we will consider the substructure 
of (massive) jets reconstructing kinematic decays and of QCD jets, there are two 
natural choices of variables. Jet-rest-frame variables are useful to understand decays 
because the decay cross section takes a simple form. Lab-frame variables are useful 
because jet algorithms are formulated in the lab frame, so algorithm systematics are 
most easily understood there. The QCD soft/collinear singularity structure is also
easy to express in lab-frame variables.

Naively, there are twelve variables completely describing a 1 → 2 splitting. Here
we will focus on the top branching (the last merging) of the jet splitting into two
daughter subjets, which we will label J → 1, 2. Imposing the four constraints from
momentum conservation on the branching leaves eight independent variables. The
invariance of the algorithm metrics under longitudinal boosts and azimuthal rotations
removes two of these (they are irrelevant). For simplicity we will use this invariance
to set the jet's direction to be along the x axis, defining the z axis to be along the
beam direction. Therefore there are six relevant variables needed to describe a 1 → 2
branching. Three of these variables are related to the three-momenta of the jet and
subjets, and the other three are related to their masses.

Of the six variables, only one needs to be dimensionful, and we can describe all 
other scales in terms of this one. We choose the mass m_J of the jet. In addition, we
use the masses of the two daughter subjets scaled by the jet mass:

\[ a_1 = \frac{m_1}{m_J} \quad \text{and} \quad a_2 = \frac{m_2}{m_J}. \tag{2.34} \]

We choose the particle labeled by '1' to be the heavier particle, a_1 > a_2. The three
masses, m_J, a_1, and a_2, will be common to both sets of variables. Additionally, we
will typically want to fix the p_T of the jet and determine how the kinematics of a
system change as p_TJ is varied. For QCD, a useful dimensionless quantity is the ratio
of the mass and p_T of the jet, whose square we call x_J:

\[ x_J = \frac{m_J^2}{p_{TJ}^2}. \tag{2.35} \]

For decays, we will opt instead to use the familiar magnitude γ of the boost of the
heavy particle from its rest frame to the lab frame, which is related to x_J by

\[ \gamma = \sqrt{\frac{1}{x_J} + 1}, \qquad x_J = \frac{1}{\gamma^2 - 1}. \tag{2.36} \]

The remaining two variables, which are related to the momenta of the subjets, will 
differ between the rest-frame and lab-frame descriptions of the splitting. 
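
As a concrete illustration of the common variables, the following Python sketch computes m_J, a_1, a_2, x_J, and γ from two daughter four-momenta. The function names and the (E, px, py, pz) tuple layout are choices made here for illustration only. The boost is evaluated as γ = sqrt(1/x_J + 1), which assumes the jet carries no longitudinal momentum (the jet is taken along the x axis, as in the text).

```python
import math

def mass(p):
    """Invariant mass of a four-vector p = (E, px, py, pz); clipped at zero."""
    m2 = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
    return math.sqrt(max(m2, 0.0))

def pt(p):
    """Transverse momentum with respect to the beam (z) axis."""
    return math.hypot(p[1], p[2])

def common_variables(p1, p2):
    """Variables shared by the rest-frame and lab-frame descriptions of J -> 1, 2."""
    jet = tuple(a + b for a, b in zip(p1, p2))
    m_j = mass(jet)
    # Label the heavier daughter '1', so that a_1 >= a_2.
    a_1, a_2 = sorted((mass(p1) / m_j, mass(p2) / m_j), reverse=True)
    x_j = m_j ** 2 / pt(jet) ** 2
    # Valid for a jet with no longitudinal momentum (assumption stated above).
    gamma = math.sqrt(1.0 / x_j + 1.0)
    return {"m_J": m_j, "a_1": a_1, "a_2": a_2, "x_J": x_j, "gamma": gamma}
```

For two massless daughters (5, 4, 3, 0) and (5, 4, -3, 0), the jet is (10, 8, 0, 0), so m_J = 6 and x_J = 36/64, and the boost from x_J agrees with E/m_J = 10/6.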

Unpolarized 1 → 2 decays are naturally described in their rest frame by two angles.
These angles are the polar and azimuthal angles of one particle (the heavier one, say)
with respect to the direction of the boost to the lab frame, and we label them θ_0 and
φ_0 respectively. Since we are choosing that the final jet be in the x direction, θ_0 is
measured from the x direction while φ_0 is the angle in the y-z plane, which we choose
to be measured from the y direction. Putting these variables together, the set that
most intuitively describes a heavy particle decay is the "rest-frame" set

\[ \{m_J, a_1, a_2, \gamma, \cos\theta_0, \phi_0\}. \tag{2.37} \]

In the lab frame, we want to choose variables that are invariant under longitudinal
boosts and azimuthal rotations. The angle ΔR_12 between the daughter particles is a
natural choice, as is the ratio of the minimum daughter p_T to the parent p_T, which is
commonly called z:

\[ z \equiv \frac{\min(p_{T1}, p_{T2})}{p_{TJ}}. \tag{2.38} \]

These variables make the recombination metrics for the k_T and CA algorithms simple:

\[ \rho_{12}(k_T) = p_{TJ}\, z\, \Delta R_{12} \quad \text{and} \quad \rho_{12}(\mathrm{CA}) = \Delta R_{12}. \tag{2.39} \]

Note that for a generic recombination, the momentum factors in the denominator of
Eq. (2.38) and in the k_T metric in Eq. (2.39) should be p_Tp, the momentum of the
parent or combined subjet of the 2 → 1 recombination.

From these considerations we choose to describe recombinations in the lab frame
with the set of variables

\[ \{m_J, a_1, a_2, x_J, z, \Delta R_{12}\}. \tag{2.40} \]

In using these variables it is essential to understand the structure of the
corresponding phase space, especially for the last two variables in both sets. If we require
that the decay "fits" in a jet, constraints and correlations appear. These are clearest
in terms of the lab-frame variables ΔR_12 and z. As a first step in understanding these
correlations, we plot in Fig. 2.5 the contour ΔR_12 = D (= 1.0) in the (cos θ_0, φ_0) phase
space for different values of γ and over different choices for a_1 and a_2. These specific
values of a_1 and a_2 correspond to a variety of interesting processes: a_1 = a_2 = 0 gives
the simplest kinematics and is therefore a useful starting point; a_1 = 0.46, a_2 = 0
gives the kinematics of the top quark decay; a_1 = 0.9, a_2 = 0 and a_1 = 0.3, a_2 = 0.1
are reasonable values for subjet masses from the CA and k_T algorithms respectively.
The contour ΔR_12 = D defines the boundary in phase space where a 1 → 2 process
will no longer fit in a jet, with the interior region corresponding to splittings with
ΔR_12 < D. Note that the contour is nearly vertical, increasingly so for larger γ.
This is a reflection of the fact that ΔR_12 is nearly independent of φ_0, up to terms
suppressed by γ^-2.

Figure 2.5: Boundaries in the cos θ_0-φ_0 plane for a recombination step to fit in a jet
of size D = 1.0, for several values of the boost γ and the subjet masses {a_1, a_2}. The
"interior" region has ΔR_12 < D.

Panels: (c) a_1 = 0.9, a_2 = 0; (d) a_1 = 0.3, a_2 = 0.1.

While the constraint ΔR_12 < D becomes simpler in the (z, ΔR_12) phase space,
the boundaries of the phase space become more complex. In Fig. 2.6, we plot the
available phase space in (z, ΔR_12) for the same values of x_J, a_1, and a_2 as in Fig. 2.5,
translating the value of γ into x_J. The most striking feature is that for fixed x_J,
a_1, and a_2, the phase space in (z, ΔR_12) is nearly one-dimensional; this is again due
to the fact that ΔR_12 and also z are nearly independent of φ_0. In particular, for
a_1 = a_2 = 0 (as in Fig. 2.6a), the phase space approximates the contour describing
fixed x_J for small ΔR_12, which takes the simple form

\[ x_J = \frac{m_J^2}{p_{TJ}^2} \approx z(1 - z)\,\Delta R_{12}^2. \tag{2.41} \]

This approximation is accurate even for larger angles, ΔR_12 ~ 1, at the 10% level.
Note also that the width of the band about the contour described by Eq. (2.41) is
itself of order x_J. As we decrease x_J the band moves down and becomes narrower, as
indicated in Fig. 2.6a.

Figure 2.6: Boundaries in the z-ΔR_12 plane for a recombination step of fixed
{a_1, a_2, x_J}, for various values of x_J and the subjet masses {a_1, a_2}. Configurations
with ΔR_12 < D fit in a jet; D = 1.0 is shown for example.
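
The relation between the rest-frame set (m_J, a_1, a_2, γ, cos θ_0, φ_0) and the lab-frame pair (z, ΔR_12) can be explored numerically. The sketch below is an illustration under stated assumptions, not the procedure used to produce the figures: it assumes a parent moving along the x axis at zero rapidity, uses the standard two-body decay momentum in the rest frame, and defines z as the minimum daughter p_T over the jet p_T and ΔR_12 from the rapidity and azimuth differences. All function names are hypothetical.

```python
import math

def decay_daughters(m_j, a1, a2, gamma, cos_theta0, phi0):
    """Lab-frame four-vectors (E, px, py, pz) of the two daughters; parent moves along x."""
    # Two-body decay momentum in the parent rest frame.
    p_star = 0.5 * m_j * math.sqrt((1 - (a1 + a2) ** 2) * (1 - (a1 - a2) ** 2))
    sin_theta0 = math.sqrt(1.0 - cos_theta0 ** 2)
    # Direction of daughter 1: polar angle measured from x, azimuth measured from y.
    nx, ny, nz = cos_theta0, sin_theta0 * math.cos(phi0), sin_theta0 * math.sin(phi0)
    beta_gamma = math.sqrt(gamma ** 2 - 1.0)
    daughters = []
    for m, sign in ((a1 * m_j, 1.0), (a2 * m_j, -1.0)):
        e_star = math.hypot(p_star, m)
        px_star = sign * p_star * nx
        daughters.append((
            gamma * e_star + beta_gamma * px_star,  # boosted energy
            gamma * px_star + beta_gamma * e_star,  # boosted momentum along x
            sign * p_star * ny,
            sign * p_star * nz,
        ))
    return daughters

def lab_variables(d1, d2):
    """Return (x_J, z, Delta R_12) for two daughter four-vectors."""
    def pt(p):
        return math.hypot(p[1], p[2])
    def rap(p):
        return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))
    jet = tuple(a + b for a, b in zip(d1, d2))
    m2 = jet[0] ** 2 - jet[1] ** 2 - jet[2] ** 2 - jet[3] ** 2
    dphi = abs(math.atan2(d1[2], d1[1]) - math.atan2(d2[2], d2[1]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    delta_r = math.hypot(rap(d1) - rap(d2), dphi)
    return m2 / pt(jet) ** 2, min(pt(d1), pt(d2)) / pt(jet), delta_r
```

For massless daughters at γ = 10 and cos θ_0 = 0, evaluating `lab_variables(*decay_daughters(100.0, 0.0, 0.0, 10.0, 0.0, phi0))` at φ_0 = 0 and φ_0 = π/2 gives values of ΔR_12 that differ at the sub-percent level, and z(1 - z)ΔR_12^2 reproduces x_J = 1/(γ^2 - 1) to within about a percent.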

As illustrated in Figs. 2.6b and 2.6d, we can also see a double-band structure to
the (z, ΔR_12) phase space. The upper band corresponds to the case where the lighter
daughter is softer (smaller-p_T) than the heavier daughter (and determines z), while
the lower band corresponds to the case where the heavier daughter is softer. This
does not occur in Fig. 2.6a because a_1 = a_2 (the single band is double-covered), or in
Fig. 2.6c because the heavier particle is never the softer one for the chosen values of
x_J.

We have said nothing about the density of points in phase space for either pair of 
variables. This is because the weighting of phase space is set by the dynamics of a 
process, while the boundaries are set by the kinematics. Decays and QCD splittings 
weight the phase space differently, as we will see in Sec. 4.1.

Ordering in Recombination Algorithms 

Having laid out variables useful to describe 1 → 2 processes, we can discuss how the jet
algorithm orders recombinations in these variables. Recombination algorithms merge
objects according to the pairwise metric ρ_ij. The sequence of recombinations is almost
always monotonic in this metric: as the algorithm proceeds, the value increases. Only
certain kinematic configurations will decrease the metric from one recombination to
the next, and the monotonicity violation is small and rare in practice.
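
The recombination sequence, its unfolding into subjets, and the ordering of the metric can be sketched with a toy clustering routine. This is a minimal illustration under simplifying assumptions, not the implementation used in this work: it clusters all input particles into a single jet (no beam metric and no jet-size parameter D), uses the pairwise metrics ΔR (CA) and min(p_T) ΔR (k_T), and recombines by adding four-vectors. All names are hypothetical.

```python
import math

def four_vector(pt, y, phi):
    """Massless (E, px, py, pz) from transverse momentum, rapidity, and azimuth."""
    return (pt * math.cosh(y), pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y))

def pt(p):
    return math.hypot(p[1], p[2])

def rapidity(p):
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))

def delta_r(p, q):
    dphi = abs(math.atan2(p[2], p[1]) - math.atan2(q[2], q[1]))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(rapidity(p) - rapidity(q), dphi)

def metric(p, q, algorithm):
    """Pairwise metric: angle only for CA, angle weighted by the softer pT for kT."""
    dr = delta_r(p, q)
    return dr if algorithm == "CA" else min(pt(p), pt(q)) * dr

def cluster(particles, algorithm="CA"):
    """Merge the closest pair until one jet remains; return (jet tree, metric history)."""
    nodes = [{"p": p, "children": None, "step": -1} for p in particles]
    history = []
    while len(nodes) > 1:
        rho, i, j = min(
            (metric(nodes[i]["p"], nodes[j]["p"], algorithm), i, j)
            for i in range(len(nodes))
            for j in range(i + 1, len(nodes))
        )
        merged = tuple(a + b for a, b in zip(nodes[i]["p"], nodes[j]["p"]))
        parent = {"p": merged, "children": (nodes[i], nodes[j]), "step": len(history)}
        history.append(rho)
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [parent]
    return nodes[0], history

def unfold(jet, n_subjets=2):
    """Undo the last (n_subjets - 1) mergings and return the resulting subjets."""
    subjets = [jet]
    while len(subjets) < n_subjets:
        latest = max(subjets, key=lambda n: n["step"])
        if latest["children"] is None:
            break  # only original particles remain
        subjets = [s for s in subjets if s is not latest] + list(latest["children"])
    return subjets
```

With four input particles forming two tight pairs, `cluster(particles, "CA")` merges each pair first and the two pairs last, so `unfold(jet, 2)` returns the two pair-subjets, and the recorded metric values increase from step to step.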

This means it is straightforward to understand the typical recombinations that 
occur at different stages of the algorithm. We can think in terms of a phase space 
boundary: the algorithm enforces a boundary in phase space at a constant value of 
the recombination metric that evolves to larger values as the recombination process 
proceeds. If a recombination occurs at a certain value of the metric, ρ_0, then subsequent
recombinations are very unlikely to have ρ_ij < ρ_0, meaning that region of phase
space is unavailable for further recombinations.

In Fig. 2.7, we plot typical boundaries for the CA and k_T algorithms in the
(z, ΔR_12) phase space. For CA, these boundaries are simply lines of constant ΔR_12,
since the recombination metric is ρ_ij(CA) = ΔR_ij. For k_T, these boundaries are contours
in z ΔR_12, and implicitly depend on the p_T of the parent particle in the splitting.
Because the k_T recombination metric for i, j → p is ρ_ij(k_T) = z ΔR_ij p_Tp, increasing the
value of p_Tp will shift the boundary in to smaller z ΔR_ij. These algorithm-dependent
ordering effects will be important in understanding the restrictions on the kinematics
of the last recombinations in a jet. For instance, we expect to observe no small-angle
late recombinations in a jet defined by the CA algorithm.
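
The ordering boundaries described above can be written as a short Python sketch. It assumes the metrics take the forms ρ(CA) = ΔR and ρ(k_T) = p_Tp z ΔR discussed in the text, and treats the boundary value ρ_0 as an input (an angle for CA, a momentum for k_T). The function names are illustrative only.

```python
def rho(z, delta_r, pt_parent, algorithm):
    """Recombination metric of a 2 -> 1 merging in (z, Delta R) variables."""
    if algorithm == "CA":
        return delta_r
    if algorithm == "kT":
        return pt_parent * z * delta_r
    raise ValueError("algorithm must be 'CA' or 'kT'")

def still_allowed(z, delta_r, pt_parent, rho_0, algorithm):
    """True if a merging lies above the boundary rho_0 reached so far."""
    return rho(z, delta_r, pt_parent, algorithm) >= rho_0

def kt_boundary_z(delta_r, pt_parent, rho_0):
    """Value of z on the kT boundary z * Delta R * pT_parent = rho_0."""
    return rho_0 / (pt_parent * delta_r)
```

For example, once CA has reached ρ_0 = 0.3, a merging at ΔR = 0.1 is excluded whatever its z, while for k_T the boundary in z at fixed ΔR moves to smaller z as the parent p_T grows.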

2.4.5 Event shapes and jet shapes

An alternative characterization of hadronic activity is an event shape. Event shapes,
such as thrust, characterize events based on the distribution of energy in the final
state by assigning differing weights to events with differing energy distributions.*
Events that are two-jet-like, with two very collimated back-to-back jets, produce
values of the observable at one end of the distribution, while spherical events with
a broad energy distribution produce values of the observable at the other end of the
distribution. While event shapes can quantify the global geometry of events, they
are not sensitive to the detailed structure of jets in the event. Two classes of events
may have similar values of an event shape but characteristically different structure in
terms of number of jets and the energy distribution within those jets.

* This subsection is taken, with small modifications, from Sec. 2 of [25].

Figure 2.7: Typical boundaries (red, dashed lines) on phase space due to ordering in
the CA and k_T algorithms. The shaded region below the boundaries is cut out, and
the more heavily shaded regions correspond to earlier in the recombination sequence.
The cutoff ΔR_ij = D = 1.0 is shown for reference (black, dashed line).

Jet shapes, which are event shape-like observables applied to single jets, are an 
effective tool to measure the structure of individual jets. Just as event shapes are an 
alternative to jets in characterizing an event, jet shapes are an alternative to subjet 
descriptions of jet substructure. These observables can be used not only to quantify
QCD-like events, but also to study more complex, non-QCD topologies, as illustrated for
light quark vs. top quark and Z jets in [52, 53]. Broad jets, with wide-angle energy
depositions, and very collimated jets, with a narrow energy profile, take on distinct
values for jet shape observables. In Chapter 3, we consider the example of the class
of jet shapes called angularities, defined in Eq. (3.2) and denoted τ_a. Every value of a
corresponds to a different jet shape. As a decreases, the angularity weights particles at
the periphery of the jet more, and is therefore more sensitive to wide-angle radiation.
Simultaneous measurements of the angularity of a jet for different values of a can be
an additional probe of the structure of the jet.
Chapter 3

CONSISTENT FACTORIZATION OF JET OBSERVABLES 
IN EXCLUSIVE MULTIJET CROSS SECTIONS 

3.1 Introduction 

Final states that contain several jets are important Standard Model backgrounds 
to many new physics processes in high-energy colliders, in addition to serving as 
sensitive probes of Quantum Chromodynamics (QCD) itself over a wide range of 
energy scales.* The structure of jet-like final states contains signatures of the hard
scattering of parton-like degrees of freedom, the branching and showering at ever
lower energies, and hadronization at the lowest scale Λ_QCD. Probing the structure
of jets both teaches us about QCD and can help us to distinguish jets of Standard 
Model origin from those that are truly signatures for new physics. 

The presence of multiple scales governing jets is at once an opportunity to probe 
many aspects of their physics and also a challenge due to the generation of large 
logarithms of ratios of these scales spoiling the behavior of perturbation theory. A
powerful framework to separate physics at different scales and to improve the behavior
of perturbation series is effective field theory (EFT). EFTs aid in factorizing an
observable dependent on multiple scales into pieces each sensitive to a single energy
scale. Renormalization group (RG) evolution of these pieces in EFT achieves resummation
of large logarithms to all orders in perturbation theory. Factorization also
allows the disentangling of perturbative and non-perturbative physics [54, 55].

Soft-Collinear Effective Theory (SCET) [11, 12, 13, 14] has had considerable success
in applications to many hard-scattering cross sections [37] and jet cross sections.
SCET separates degrees of freedom in QCD into distinct soft and collinear modes,
expanding the full theory in a parameter λ that characterizes the size of collinear
momenta transverse to the jet direction, and provides a framework to factorize cross
sections into separate pieces coming from interactions at hard, collinear, and soft
scales. This was done in SCET for event shape variables using hemisphere jet algorithms
in e+e- colliders [56, 57] and for "isolated Drell-Yan" (where central jets are
vetoed) in hadron colliders [58]. In addition, there has been progress in understanding
how to implement jet algorithms other than the simple hemisphere jet algorithm in
SCET. In [59, 60], total two-jet rates where the jets are defined by Sterman-Weinberg
jet algorithms were computed at NLO. These results were extended to the cases of
the exclusive k_T and JADE algorithms in [61].

* This chapter, with small modifications, is taken from [1].

In most applications of SCET to exclusive jet cross sections considered to date,
there are two back-to-back jets. (Recently Ref. [62] considered direct photon production
in hadron collisions, involving three collinear directions.) In this work we
consider for the first time exclusive N-jet final states with arbitrary N > 2 for the
SISCone [63], Snowmass [64], inclusive k_T [50], anti-k_T [46], and Cambridge-Aachen
[51] jet algorithms. We find that a new feature that arises when more than two jets
are present is that the parameter λ is not in itself sufficient to ensure factorization. In
particular, factorization is valid to leading order in λ and in a jet separation measure
1/t, where

\[ t = \frac{\tan(\psi/2)}{\tan(R/2)}, \tag{3.1} \]

with R the angular size of a jet as defined by a jet algorithm and ψ the minimum angle
between two jets. This is due to the fact that jets need to be both well-collimated
(λ ≪ 1) and well-separated (t ≫ 1). The latter requirement is trivial for back-to-back
jets since 1/t = 0 for ψ = π.
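
The separation measure is simple to evaluate numerically. The short sketch below, with a hypothetical function name, computes t = tan(ψ/2)/tan(R/2) for a minimum inter-jet angle ψ and jet size R, both in radians.

```python
import math

def jet_separation(psi, R):
    """Separation measure t = tan(psi/2) / tan(R/2)."""
    return math.tan(psi / 2.0) / math.tan(R / 2.0)
```

For back-to-back jets, `1.0 / jet_separation(math.pi, 1.0)` is zero up to floating-point rounding, while three symmetric jets with ψ = 2π/3 and R = 0.7 give a finite t larger than one.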

Our analysis applies not only to the total N-jet cross section, but also in the case
that jet observables are measured on some number M < N of the jets. We will
illustrate the measurement of angularities (cf. [65, 52]), defined by

\[ \tau_a(J) = \frac{1}{2E_J} \sum_{i \in J} |p_T^i|\, e^{-\eta_i (1 - a)}, \tag{3.2} \]

where E_J is the energy of the jet J, the sum is over particles i in the jet, and p_T^i and η_i
are the transverse momentum and (pseudo-)rapidity of particle i with respect to the
jet axis. However, most of our results do not depend on this choice of observable, and
we organize the calculation such that other observables can be easily implemented.
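
As an illustration, the angularity of a jet can be computed from a list of constituents. The sketch below assumes the standard form τ_a = (1/2E_J) Σ_i |p_T^i| exp(-η_i (1 - a)), with p_T^i and η_i measured with respect to the jet axis; the function name and the (pT, eta) input layout are choices made here, not part of the original analysis.

```python
import math

def angularity(particles, jet_energy, a):
    """Angularity tau_a of a jet.

    particles: iterable of (pT, eta) pairs, both measured relative to the jet axis.
    jet_energy: total energy E_J of the jet.
    """
    return sum(p_t * math.exp(-eta * (1.0 - a)) for p_t, eta in particles) / (2.0 * jet_energy)
```

A particle at small η (wide angle from the jet axis) is suppressed less than one at large η (nearly collinear), and the ratio of the two weights grows as a decreases, which is the sense in which smaller a emphasizes the periphery of the jet.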

Distributions of jet shapes such as angularities contain logarithms of τ_a that become
large in the limit τ_a → 0, of the form (α_s^n ln^k τ_a)/τ_a with k ≤ 2n. The
factorization theorem we present provides the basis for resummation of sets of these
logarithms to all orders in α_s. In the exponent, ln R(τ_a), of the "radiator" R(τ_a) =
(1/σ_0) ∫_0^{τ_a} dτ'_a (dσ/dτ'_a), these appear in the form α_s^n ln^m τ_a with m ≤ n + 1 [66, 67].
Our results here allow us to sum to leading-logarithmic (LL) (m = n + 1) and
next-to-leading-logarithmic (NLL) (m = n) accuracy in this exponent.
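
The counting of logarithms in the exponent can be written schematically as follows. The coefficients c_{nm} and the shorthand L are introduced here only for illustration and are not notation from the original work.

```latex
\[
\ln R(\tau_a) \;=\; \sum_{n \ge 1} \; \sum_{m \le n+1} c_{nm}\, \alpha_s^n L^m ,
\qquad L \equiv \ln \tau_a ,
\]
\[
\text{LL:}\quad \alpha_s^n L^{n+1} \;\; (m = n + 1), \qquad
\text{NLL:}\quad \alpha_s^n L^{n} \;\; (m = n).
\]
```

When α_s L is of order one, every LL term is of order 1/α_s and every NLL term is of order one, which is why each tower must be summed to all orders.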

The set of jet shapes τ_a contains similar information to the "original" jet shape
Ψ(r/R) [68, 69, 70], the fraction of energy of a jet of size R in a sub-cone of size r.
Distributions in this jet shape in hadron collisions were resummed to so-called
"modified LL" accuracy (which includes the k = 2n and k = 2n - 1 terms as enumerated
for the distribution above) in [71].

Factorization of event shape distributions in SCET was proven in [72, 73], and
factorization for multijet observables defined with arbitrary algorithms was considered
in [74]. The extension to the more general case that we consider involves the
straightforward combination of the techniques developed in these papers and will be derived
in detail in [25]. In this work we demonstrate that, after intricate cancellations among
the various contributions to the jet and soft functions, consistency of the factorization 
theorem is satisfied at NLL accuracy. In order for the factorization theorem to be 
consistent, the hard, jet, and soft functions defined must satisfy a strong condition
on their anomalous dimensions:

\[
\begin{aligned}
0 = {} & \Big( \gamma_H + \sum_{i=M+1}^{N} \gamma_{J_i} \Big)\, \delta(\tau_1) \cdots \delta(\tau_M) \\
& + \sum_{i=1}^{M} \gamma_{J_i}(\tau_i) \prod_{\substack{j=1 \\ j \neq i}}^{M} \delta(\tau_j)
  + \gamma_S(\tau_1, \ldots, \tau_M),
\end{aligned}
\tag{3.3}
\]

for any number N of total jets and M of measured jets, and any color representation
of each jet. This consistency condition is made even more nontrivial by the potential
dependence of the jet and soft anomalous dimensions on the jet algorithm parameters.
In this chapter we demonstrate that Eq. (3.3) does in fact hold for arbitrary numbers,
types, and sizes of jets in the final state, up to certain power corrections we are able
to identify.

Observables measuring jet shapes like τ_a, while also restricting the phase space
into which soft gluons can be emitted, can be plagued by "non-global" logarithms [75]
beginning at NLL order that may not be resummed by our methods. In particular there
can be logarithms in our jet shape distributions generated by the energy cut Λ that
we place on soft radiation outside jets [25]. Ref. [76] demonstrated the factorization
of similar distributions into global and non-global parts. Our results here allow the
resummation of logarithms of τ_a in the global part. More simply, the non-global
logarithms can be removed by choosing Λ ~ E_J τ_a [65]. In [25] we address resummation
in the case that these scales remain disparate. Despite these potential complications,
which deserve additional study, our demonstration of a consistent factorization
theorem for jet shapes defined with a jet algorithm provides a key advance towards the
resummation of any such jet shape distributions.

We begin in Sec. 3.2 by defining the phase space cuts needed to implement our 
choice of jet algorithms. In Sec. 3.3 we then present the factorization theorem for
N-jet events and define the hard, jet, and soft functions, and identify power corrections
to the factorization. In Sec. 3.4 we give the form of the RG evolution equations
obeyed by the factorized functions. In Sec. 3.5 we summarize the results of all the
anomalous dimensions needed for NLL running and demonstrate how they intricately
satisfy the consistency condition Eq. (3.3). This requires calculating only the infinite
parts of the bare functions. We give the finite pieces of the jet and soft functions
(which are not needed at NLL) in [25]. In Sec. 3.6 as an example we calculate quark
and gluon angularity jet shapes in 3-jet final states with logarithms of τ_a resummed
to NLL accuracy.

3.2 Phase Space Cuts and the Jet Algorithm 

Two general categories of jet algorithms, cone algorithms and recombination (k_T-type)
algorithms, are commonly used to find jets. For a jet composed of two particles, as in
a next-to-leading order description, the phase space constraints implied by each type
of algorithm become very simple. In this work we deal with the common forms of
cone and (inclusive) k_T-type algorithms; our cone algorithms include the Snowmass
and SISCone algorithms, and our recombination algorithms include the inclusive k_T,
Cambridge-Aachen, and anti-k_T algorithms. Cone algorithms require each particle to
be within an angle R of the jet axis, while recombination algorithms require the
two particles to be within an angle D of each other. If we label the jet
axis as n and its constituent particles as 1 and 2, then the algorithm constraints for
a two-particle jet are:

\[
\begin{aligned}
\text{cone type:} \quad & \theta_{1n} < R \ \text{ and } \ \theta_{2n} < R, \\
k_T \text{ type:} \quad & \theta_{12} < D.
\end{aligned}
\tag{3.4}
\]

For the parts of the jet and soft functions that we give in this work, we find that
the functional form is the same for cone-type and k_T-type algorithms in terms of the
angular parameter R or D. Therefore, we will use the more common R in writing
down the jet and soft functions, but we note here that the functional form is the same
for k_T with the replacement R → D.
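
The two-particle constraints can be checked directly on three-momenta. The sketch below is illustrative only: it assumes the jet axis n points along the summed three-momentum of the two particles, and the function names are hypothetical.

```python
import math

def angle(u, v):
    """Opening angle between two three-vectors, in radians."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def in_cone_jet(p1, p2, R):
    """Cone-type constraint: both particles within R of the jet axis."""
    axis = tuple(a + b for a, b in zip(p1, p2))  # assumed jet axis: total momentum
    return angle(p1, axis) < R and angle(p2, axis) < R

def in_kt_jet(p1, p2, D):
    """kT-type constraint: the two particles within D of each other."""
    return angle(p1, p2) < D
```

Two equal-momentum particles separated by 0.8 radians each sit 0.4 radians from the axis, so with the same numerical value 0.7 for R and D they pass the cone-type condition but fail the k_T-type one.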

Note that, while all algorithms that we consider fall into one of the two constraints
in Eq. (3.4) at NLO, at higher orders the various algorithms will behave differently.
Without taking this into account, we have no guarantee that we can resum all
logarithms of jet algorithm parameters correctly.* This is not a problem we solve in this
paper. In this paper, we resum logarithms of jet observables in the presence of phase
space cuts due to an algorithm, demonstrate that the factorization theorem and NLL
running are valid and consistent, and identify the power corrections to this statement.

At the hard scale, we match an N-leg amplitude in QCD onto an N-jet operator in
SCET, meaning we must enforce that the number of jets is fixed to be N. To enforce
that we have no more than N jets, we require the total energy of particles that
do not enter jets to be less than a cutoff Λ. To enforce that we have at least N jets, we
need each jet to be well separated from every other jet. The requirement
of consistency of NLL running will give a quantitative measure of this separation,
requiring that t ≫ 1.

3.3 Factorized Jet Shapes in N-Jet Production 

The cross section for e+e- annihilation to N jets at center-of-mass energy Q, differential
in the jet three-momenta P_j of the jets and in the shapes of M of these jets, is
given in QCD by

\[
\begin{aligned}
\frac{d\sigma}{d\tau_1 \cdots d\tau_M\, d^3P_1 \cdots d^3P_N}
= {} & \frac{1}{2Q^2} \sum_X (2\pi)^4 \delta^4(Q - p_X)\,
       \big| \langle X | j^\mu(0) | 0 \rangle L_\mu \big|^2 \\
& \times \prod_{i=1}^{M} \delta\big(\tau_i - \tau_a(J_i)\big)
  \prod_{j=1}^{N} \delta^3\big(P_j - P(J_j)\big)\, \delta_{N, N(\mathcal{J}(X))},
\end{aligned}
\tag{3.5}
\]

where J_i is the ith jet in X identified by the jet algorithm 𝒥. The Kronecker delta
restricts the sum over states to those that are identified as having N jets by the
algorithm. The final state is produced by the QCD current j^μ = q̄ γ^μ q, and L_μ is the
leptonic part of the amplitude for e+e- → γ*.

* The k_T algorithm, for example, is known to spoil naive exponentiation [77].

To factorize the cross section Eq. (3.5), we begin by matching the QCD current
j^μ onto a set of N-jet operators in SCET. These operators are built from quark and
gluon jet fields,

\[ \chi_n = W_n^\dagger \xi_n, \qquad
   B_{n\perp}^\mu = \frac{1}{g}\, W_n^\dagger \big( \mathcal{P}_\perp^\mu + g A_{n\perp}^\mu \big) W_n, \tag{3.6} \]

where ξ_n, A_n are collinear quark and gluon fields in SCET, and W_n is a Wilson line
of the O(1) component n̄·A_n of collinear gluons,

\[ W_n(x) = \sum_{\text{perms}} \exp\Big[ -\frac{g}{\bar{\mathcal{P}}}\, \bar{n} \cdot A_n(x) \Big]. \tag{3.7} \]

We have made use of the label operator 𝒫^μ, which picks out the large O(1) n̄·p and
O(λ) p_⊥ components of the label momentum p of a collinear field in SCET. We will
not need to construct the N-jet operators explicitly, but bases of 2, 3, 4 jet operators
have been given in [37, 78, 79], respectively.

To describe an N-jet cross section, we construct an effective theory Lagrangian
by adding N copies of the collinear Lagrangian in SCET (in different light-cone
directions n_i) together with one soft Lagrangian. In each collinear sector, we redefine
collinear fields by multiplying by Wilson lines of soft gluons to eliminate the coupling
of soft gluons to collinear modes in the leading-order SCET Lagrangian [14],
ξ_n = Y_n ξ_n^(0) and A_n = 𝒴_n A_n^(0), where

\[ Y_n(x) = P \exp\Big[ i g \int_0^\infty ds\; n \cdot A_s(ns + x) \Big], \tag{3.8} \]

with A_s in the fundamental representation, and 𝒴_n similarly defined but in the adjoint
representation.

Performing the above steps in Eq. (3.5) for the jet shape distribution, the details
of which we report in [25], we obtain the factorized form in SCET,

\[
\begin{aligned}
\frac{d\sigma}{d\tau_1 \cdots d\tau_M\, d^3P_1 \cdots d^3P_N}
= {} & \frac{d\sigma^{(0)}}{d^3P_1 \cdots d^3P_N}\, H(P_1, \ldots, P_N)
       \prod_{j=M+1}^{N} J^{f_j}_{n_j, \omega_j} \\
& \times \prod_{i=1}^{M} \int d\tau_J^i\, d\tau_S^i\;
  \delta(\tau_i - \tau_J^i - \tau_S^i)\, J^{f_i}_{n_i, \omega_i}(\tau_J^i)\;
  S(\tau_S^1, \ldots, \tau_S^M),
\end{aligned}
\tag{3.9}
\]

where σ^(0) is the Born cross section for e+e- → N partons, H = 1 + O(α_s) is the
hard coefficient given by the matching coefficient of the SCET N-jet operator, and J
and S are jet and soft functions. The superscripts f_i denote the color representation
(corresponding to a quark, antiquark, or gluon) of the jet corresponding to the ith leg
in the N-jet operator. We number the legs so that i = 1, ..., M are the jets whose
shapes we measure, and the remainder j = M + 1, ..., N are left unmeasured.

The quark and gluon jet functions for jets whose shapes are measured are defined by

\[ J^{q}_{n,\omega}(\tau_J) = \cdots \times
   \langle 0 | \chi_{n,\omega}(0) | X_n \rangle \langle X_n | \bar\chi_{n,\omega}(0) | 0 \rangle\,
   \delta\big(\tau_J - \tau_a(J(X_n))\big), \tag{3.10a} \]

\[ J^{g}_{n,\omega}(\tau_J) = \frac{\cdots}{D - 2} \times
   \langle 0 | g B^{\perp}_{n,\omega}(0) | X_n \rangle \langle X_n | g B^{\perp}_{n,\omega}(0) | 0 \rangle\,
   \delta\big(\tau_J - \tau_a(J(X_n))\big), \tag{3.10b} \]

where the traces are over color and spinor indices, and D is the number of dimensions.
The sums are over states in the n-collinear sector. The label direction and energy
n, ω are chosen to match the jet momentum P. We have factored the Kronecker delta
in the full cross section Eq. (3.5) restricting the sum over states to those with N jets
according to the algorithm 𝒥 into individual restrictions that there is precisely one
jet in each collinear sector. The delta functions of τ_J restrict the angularity of the
jet J identified in the state X_n by the jet algorithm. The jet functions J^{f_j}_{n_j,ω_j} for jets
whose shapes are left unmeasured are given by Eq. (3.10) without the delta functions
of τ_J.

The soft function, meanwhile, is given by matrix elements of soft Wilson lines
in each of the collinear directions n_i and color representations f_i of the ith jet. For
arbitrary N, multiple color structures may appear, and if so there is an implicit sum
over multiple hard functions H and soft functions S in Eq. (3.9). An N-jet soft
function takes the general form

\[
\begin{aligned}
S(\tau_S^1, \ldots, \tau_S^M) = {} & \frac{1}{\mathcal{N}} \sum_{X_s} \prod_{i=1}^{M}
  \delta\big(\tau_S^i - \tau^i(X_s)\big) \\
& \times \langle 0 | Y_{n_1}^\dagger \cdots Y_{n_N}^\dagger(0) | X_s \rangle
         \langle X_s | Y_{n_1} \cdots Y_{n_N}(0) | 0 \rangle,
\end{aligned}
\tag{3.11}
\]

where 𝒩 normalizes the soft function to δ(τ_S^1) ··· δ(τ_S^M) at tree level. There is an
implicit contraction of color indices which we have left unspecified. The whole soft
function is a color singlet. Note that the sum over soft states is restricted so that soft
particles do not create an additional jet when the jet algorithm is run on X_s. τ^i(X_s)
is the contribution to the jet shape from soft particles which are actually in the jet
J_i.

* The normalization of Eq. (3.10a) has been changed by a factor of 1/2 to agree with the definition
in [25], where J^q(μ) = 1 + O(α_s).

The factorization of the cross section Eq. (3.9) is valid in the following limits of
QCD:

1. The SCET expansion parameter λ, determined either by the jet shape τ_a for
measured jets or the jet radius R for unmeasured jets, must be small. In other
words, each jet must be well collimated.

2. The separation between any pair of jets must be large. We will find that the
natural measure for this separation is the variable t = tan(ψ/2)/tan(R/2),
where ψ is the minimum angle between two jet directions. t must be large,
that is, jets must be well separated, in order for us to factor the N-jet condition
in the full cross section Eq. (3.5) into N individual 1-jet conditions in each
collinear sector as in Eq. (3.10) and a no-jet condition in the soft sector as in
Eq. (3.11). This approximation is inevitable because each jet function J_i already
approximates all radiation emitted by other jets as coming from a Wilson line
W_{n_i} along the exactly back-to-back direction n̄_i, whereas the hard and soft
functions know the directions of all jets exactly.

3. The energy of all particles not included in a jet must be of the order of soft
momenta. This is so that setting the label energy on each of the jet fields in
Eq. (3.10) to be equal to the total jet energy is correct at leading order in λ. In
particular, the energy cut parameter Λ on energy outside of all jets is required
to be soft, Λ ~ λ^2 E_J.

4. Power corrections associated with the jet algorithm are small. For instance,
setting the jet axis equal to the label direction n is valid up to O(λ^2) corrections,
which induce corrections to the jet shape which are subleading for a < 1
[65, 73, 80]. Similarly, assuming soft particles know only about the total collinear
jet momentum by the time they are included or excluded from a jet induces
corrections that are power suppressed for sufficiently large R.

We go into greater detail about these approximations in [25].

3.4 Renormalization Group Evolution 

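The functions that we consider renormalize either multiplicatively or through
convolutions in τ. The multiplicative form of a renormalization group equation (RGE)
obeyed by a function F is

\[ \mu \frac{d}{d\mu} F(\mu) = \gamma_F(\mu)\, F(\mu), \tag{3.12} \]

with the anomalous dimension of the form

\[ \gamma_F(\mu) = \Gamma_F[\alpha] \ln\frac{\mu^2}{\omega^2} + \gamma_F[\alpha]. \tag{3.13} \]

This RGE has the solution

\[ F(\mu) = U_F(\mu, \mu_0)\, F(\mu_0), \tag{3.14} \]

where

\[ U_F(\mu, \mu_0) = e^{K_F(\mu, \mu_0)} \left( \frac{\mu_0}{\omega} \right)^{\omega_F(\mu, \mu_0)}, \tag{3.15} \]

where we define ω_F, K_F below in Eq. (3.20). The convolved form of an RGE obeyed
by functions F that depend on the observable τ is

\[ \mu \frac{d}{d\mu} F(\tau; \mu) = \int_{-\infty}^{+\infty} d\tau'\, \gamma_F(\tau - \tau'; \mu)\, F(\tau'; \mu), \tag{3.16} \]

where to all orders in α_s [56, 81]

\[ \gamma_F(\tau; \mu) = \left( \Gamma_F[\alpha] \ln\frac{\mu^2}{\omega^2} + \gamma_F[\alpha] \right) \delta(\tau)
   - \frac{2}{j_F}\, \Gamma_F[\alpha] \left[ \frac{\Theta(\tau)}{\tau} \right]_+ . \tag{3.17} \]

The solution to this RGE is [56, 82, 83, 84, 85]

\[ F(\tau; \mu) = \int d\tau'\, U_F(\tau - \tau'; \mu, \mu_0)\, F(\tau'; \mu_0), \tag{3.18} \]

where

\[ U_F(\tau; \mu, \mu_0) = \frac{e^{K_F + \gamma_E \omega_F}}{\Gamma(-\omega_F)}
   \left( \frac{\mu_0}{\omega} \right)^{j_F \omega_F}
   \left[ \frac{\Theta(\tau)}{\tau^{1 + \omega_F}} \right]_+ . \tag{3.19} \]

We note that the anomalous dimensions γ_F(μ) and γ_F(τ; μ) in general also depend
on the jet algorithm parameters R and Λ, which we have made implicit.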

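The part of the anomalous dimensions in Eqs. (3.13) and (3.17) multiplying
ln(μ^2/ω^2) is proportional, to all orders in α_s, to the cusp anomalous dimension Γ(α_s),
given to O(α_s) by Γ(α_s) = α_s/π. With one-loop results for the anomalous dimensions,
and using the two-loop form of the cusp anomalous dimension, the RGE solutions are
accurate to NLL order. In Eqs. (3.15) and (3.19), ω_F, K_F are given by

\[ \omega_F(\mu, \mu_0) = \frac{2}{j_F} \int_{\alpha_s(\mu_0)}^{\alpha_s(\mu)}
   \frac{d\alpha}{\beta[\alpha]}\, \Gamma_F[\alpha], \tag{3.20a} \]

\[ K_F(\mu, \mu_0) = \int_{\alpha_s(\mu_0)}^{\alpha_s(\mu)} \frac{d\alpha}{\beta[\alpha]}\, \gamma_F[\alpha]
   + 2 \int_{\alpha_s(\mu_0)}^{\alpha_s(\mu)} \frac{d\alpha}{\beta[\alpha]}\, \Gamma_F[\alpha]
     \int_{\alpha_s(\mu_0)}^{\alpha} \frac{d\alpha'}{\beta[\alpha']}, \tag{3.20b} \]

where β[α] is the beta function of QCD. We define j_F = 1 for RGEs of the form
Eq. (3.13).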
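
The evolution kernel for the multiplicative RGE can be checked numerically. The sketch below is an illustration under explicit assumptions rather than the implementation used in this work: it takes ω_F = (2/j_F) ∫ dα Γ_F[α]/β[α] and K_F = ∫ dα γ_F[α]/β[α] + 2 ∫ dα (Γ_F[α]/β[α]) ∫ dα'/β[α'], with all integrals running from α_s(μ_0), assumes the convention μ dα_s/dμ = β[α_s] with a one-loop beta function, and uses illustrative one-loop inputs for Γ_F and γ_F. The integrals are done with a simple Simpson rule, and all names are hypothetical.

```python
import math

BETA0 = 23.0 / 3.0  # one-loop beta-function coefficient for n_f = 5 (assumption)

def beta(alpha):
    """One-loop beta function in the convention mu d(alpha)/d(mu) = beta[alpha]."""
    return -BETA0 * alpha ** 2 / (2.0 * math.pi)

def alpha_s(mu, mu_ref=91.2, alpha_ref=0.118):
    """One-loop running coupling consistent with beta() above."""
    return alpha_ref / (1.0 + alpha_ref * BETA0 / (2.0 * math.pi) * math.log(mu / mu_ref))

def simpson(f, lo, hi, n=200):
    """Composite Simpson rule with an even number n of intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0

def omega_F(mu, mu0, cusp, j_F=1.0):
    """omega_F(mu, mu0) for a cusp function Gamma_F[alpha]."""
    return (2.0 / j_F) * simpson(lambda a: cusp(a) / beta(a), alpha_s(mu0), alpha_s(mu))

def K_F(mu, mu0, cusp, noncusp):
    """K_F(mu, mu0) for cusp and non-cusp functions Gamma_F[alpha], gamma_F[alpha]."""
    a0, a1 = alpha_s(mu0), alpha_s(mu)
    inner = lambda a: simpson(lambda b: 1.0 / beta(b), a0, a, n=50)
    return (simpson(lambda a: noncusp(a) / beta(a), a0, a1)
            + 2.0 * simpson(lambda a: cusp(a) * inner(a) / beta(a), a0, a1))

def log_U_F(mu, mu0, omega, cusp, noncusp):
    """ln U_F(mu, mu0) = K_F + omega_F ln(mu0/omega) for the multiplicative RGE (j_F = 1)."""
    return K_F(mu, mu0, cusp, noncusp) + omega_F(mu, mu0, cusp) * math.log(mu0 / omega)
```

Differentiating `log_U_F` numerically with respect to ln μ reproduces Γ_F ln(μ^2/ω^2) + γ_F evaluated at α_s(μ), which is the statement that U_F solves the multiplicative RGE with j_F = 1.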

We will find that the hard function can be written as a sum over functions that each
obey a multiplicative renormalization group equation. The unmeasured jet function
also obeys a multiplicative RGE, while the measured jet function obeys an RGE with
a convolution over $\tau$. The soft function, whose structure we will discuss in detail, can
be decomposed into terms which obey multiplicative RGEs and terms which obey
convolved RGEs.

In the next section we outline the calculations necessary to obtain all the above
anomalous dimensions to $O(\alpha_s)$.

3.5 Anomalous Dimensions and Consistency of Factorization 

In this section we discuss the calculation of the one-loop hard, jet, and soft anomalous
dimensions and the form of the anomalous dimensions in Table 3.1 and demonstrate
that the consistency condition, Eq. (3.3), is satisfied to one-loop order, to leading
order in the approximations we enumerated above. This is already an intricate test
whose satisfaction turns out to be highly nontrivial. Having verified this condition,
we proceed at the end of the chapter to give an application of NLL resummation of
the jet shape distribution made possible by our one-loop calculation of the anomalous
dimensions.

3.5.1 Hard Function 

The hard function $H$ in the factorized cross section Eq. (3.9) is given by the square
of the Wilson coefficient in the matching of the $N$-parton amplitude in QCD onto an
$N$-jet operator in SCET,

$$\langle N|\,\bar q\,\Gamma\,q\,|0\rangle = \langle N|\,C_N O_N\,|0\rangle,$$ (3.21)

where the right-hand side is actually a sum over many possible $N$-jet operators built
from the jet fields in Eq. (3.6) and soft Wilson lines Eq. (3.8). The allowed basis of
operators $O_N$ is determined by gauge and Lorentz symmetry. If there is only one
operator, the hard function is simply $H = |C_N|^2$.

The one-loop anomalous dimension of the $N$-jet matching coefficient $C_N$ can be
determined from calculations existing in the literature, for example, Table III of [86].
For an operator with $N$ legs with color charges $T_i$, the anomalous dimension of the
matching coefficient is

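$$\gamma_{C_N} = -\sum_{i=1}^{N}\left[T_i^2\,\Gamma(\alpha_s)\ln\frac{\mu}{\omega_i} + \frac{1}{2}\gamma_i\right] - \frac{1}{2}\,\Gamma(\alpha_s)\sum_{i\neq j} T_i\cdot T_j \ln\left(\frac{-n_i\cdot n_j - i0^+}{2}\right),$$ (3.22)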

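where $\gamma_i$ is given to $O(\alpha_s)$ for quarks and gluons by

$$\gamma_q = \frac{3\alpha_s}{2\pi}\,C_F, \qquad \gamma_g = \frac{\alpha_s}{\pi}\,\frac{11C_A - 2n_f}{6}.$$ (3.23)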

The anomalous dimension of the hard function itself is then given by $\gamma_H = \gamma_{C_N} + \gamma_{C_N}^*$
and can be written as

$$\gamma_H(\mu) = \sum_{i=1}^{N} \gamma_H^i(\mu) + \gamma_H^{\rm pair}.$$ (3.24)

Because the hard function obeys a multiplicative RGE, each term in the hard function
obeys a multiplicative RGE, and so each term in Eq. (3.24) has the form Eq. (3.13).
Each $\gamma_H^i$ has $\omega_F = \omega_i$, while $\Gamma_F[\alpha] = 0$ for $\gamma_H^{\rm pair}$, as listed in Table 3.1.

3.5.2 Jet Functions 

The quark and gluon jet functions are given by Eqs. (3.10a) and (3.10b) and are
calculated from cutting all possible diagrams at a given order in $\alpha_s$ correcting a
collinear propagator with label momentum $\omega n$. The jet functions include phase space
restrictions on the final-state particles from the cut requiring that only one jet is
produced. When we cut through a single propagator, the solitary parton in the final
state is automatically in the jet, but these diagrams turn out to be scaleless and thus
zero in dimensional regularization. For the cuts through loops, two collinear particles
are created in the final state, and both particles are in the jet if Eq. (3.4) is satisfied.
If Eq. (3.4) is not satisfied, we require one of the particles to have energy $E < \Lambda$, so
that only one jet is produced by the final state. Additionally, for jets whose shapes
are measured, we include a delta function, $\delta(\tau_a - \tau_a(J(X)))$, measuring the jet shape
for the particles in the jet. The restrictions on unmeasured jet functions are the same
as the measured jets except for this delta function.

We report here the results of calculating only the infinite parts of the relevant loop
graphs in dimensional regularization, in $D = 4 - 2\epsilon$ dimensions, in the $\overline{\rm MS}$ scheme.
We give the finite parts in [25]. Our calculations give anomalous dimensions for quark
and gluon jets $\gamma_{J_i}$ of the form Eq. (3.13) for unmeasured jets and $\gamma_{J_i}(\tau_a)$ of the form
Eq. (3.17) for measured jets, with the values given in Table 3.1.

In the measured jet function, we find that the zero-bin subtraction plays a key
role. The zero-bin subtraction removes doubly-counted regions of phase space from
the "naive" contributions to the jet function [36]. For the measured jet functions,
the naive contributions to the anomalous dimension only depend on $\delta(\tau_a)$ and do not
contain $(1/\tau_a)_+$ distributions. However, the zero-bin contribution to the anomalous
dimension contains non-trivial dependence away from $\tau_a = 0$, and it is only by
performing the zero-bin subtraction that we obtain the correct running of the measured
jet function.

When the final-state particles in the jet function do not pass the cuts in Eq. (3.4),
only one particle is in a jet. In this case the contribution to the jet function is power
suppressed by $O(\Lambda/\omega)$, since a collinear parton must have $E < \Lambda$ to be outside of
the jet. This contribution is not power suppressed in the naive contribution
alone, but only after the zero-bin subtraction. Additionally, the zero-bin removes the
dependence of the measured jet function anomalous dimension on the jet algorithm
parameter $R$. For unmeasured jets, the zero-bin is a scaleless integral, and the $R$
dependence remains in the unmeasured jet function.

Table 3.1: Anomalous dimensions of hard, jet, and soft functions. The cusp parts
$\Gamma_F$ and non-cusp parts $\gamma_F$ of the anomalous dimensions for hard, unmeasured jet,
measured jet, and soft functions are given, along with the constant $j_F$ appearing
in Eqs. (3.17) and (3.20a). $\Gamma$ is the cusp anomalous dimension, given to one-loop by
$\Gamma = \alpha_s/\pi$. The pieces $\gamma_i$ for quarks and gluons are given by Eq. (3.23). The three rows
for the soft anomalous dimensions are organized to correspond to the three groups of
evolution factors given in Eq. (3.32) and are given in the limit $1/t^2 \to 0$.



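Tabulating the results, we find the anomalous dimensions are

$$\gamma_{J_i} = \Gamma(\alpha_s)\,T_i^2 \ln\frac{\mu^2}{\omega_i^2\tan^2(R/2)} + \gamma_i$$ (3.25)

for unmeasured jet functions, and

$$\gamma_{J_i}(\tau_a^i) = \left[\Gamma(\alpha_s)\,T_i^2\,\frac{2-a}{1-a}\ln\frac{\mu^2}{\omega_i^2} + \gamma_i\right]\delta(\tau_a^i) - 2\,\Gamma(\alpha_s)\,T_i^2\,\frac{1}{1-a}\left[\frac{\Theta(\tau_a^i)}{\tau_a^i}\right]_+$$ (3.26)

for measured jet functions.

3.5.3 Soft Function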

The soft function in an $N$-jet cross section is given by Eq. (3.11), containing matrix
elements of soft Wilson lines in the jet directions, with each Wilson line in the
color representation of the corresponding jet. At $O(\alpha_s)$, this soft function is given
by a sum over cut diagrams represented in Fig. 3.1. The blob represents the jet in
direction $n_k$, and we leave implicit the phase space cuts needed for each diagram. We
use Feynman gauge, in which each diagram is proportional to $n_i\cdot n_j$. (Note this allows
us to drop graphs with $i = j$ or $i = k$ since $n_i^2 = 0$.)

To calculate the soft function, we must implement phase space cuts on the soft 
gluon in the final state requiring that it either be in a jet or not produce a new jet 
(i.e., it has energy less than $\Lambda$). The soft function is a sum over contributions from
all pairs of directions i and j that exchange the soft gluon, and we calculate the total 
contribution with i and j fixed before summing over directions. A natural way to 
organize the phase space of the soft gluon in the final state is as follows: 

(1) The gluon enters a measured jet and contributes to $\tau_a^k(X_s)$.

(2) The gluon enters an unmeasured jet and has any energy.

(3) The gluon is not in any jet and has energy $E < \Lambda$.

Figure 3.1: Soft function diagrams. A gluon exchanged between jets $i$ and $j$ crosses
the cut, which imposes phase space restrictions due to the jet algorithm. The blob
represents the jet in direction $k$, which the gluon may enter or not.

We name contribution (1) $S_{ij}^{\rm meas}(\tau_a^k)$, where the subscript $ij$ denotes that the gluon
goes from $i$ to $j$. Regions (2) and (3) do not contribute to the angularity of any jet
and just give an additive contribution $S_{ij}^{\rm unmeas}$ to the coefficient of $\delta(\tau_a^1)\cdots\delta(\tau_a^M)$ in
the full soft function $S(\tau_a^1,\dots,\tau_a^M)$. Contribution (3), however, is very awkward to
calculate, as we must integrate over a phase space with many "holes" (corresponding
to the jets) removed, resembling Swiss cheese. It is easier to reorganize contributions
(2) and (3) into the following form:

(A) $S_{ij}^{\rm incl}$: the gluon is anywhere with energy $E < \Lambda$.

(B) $S_{ij}^{k}$: the gluon is in jet $k$ with energy $E > \Lambda$.

(C) $\bar S_{ij}^{k}$: the gluon is in jet $k$ with energy $E < \Lambda$.

Then, the unmeasured soft gluon contribution $S_{ij}^{\rm unmeas}$ (the sum of (2) and (3) in the
original list) is given by the combination

$$S_{ij}^{\rm unmeas} = S_{ij}^{\rm incl} + \sum_{k=M+1}^{N} S_{ij}^{k} - \sum_{k=1}^{M} \bar S_{ij}^{k}.$$ (3.27)

In the first term, coming from region (A), we filled in the holes in the Swiss cheese-like
region (3) in the original list, allowing the soft gluon to go anywhere with energy
$E < \Lambda$. We compensated by adding the second term given by region (B) containing
gluons with energy $E > \Lambda$ inside unmeasured jets (part of the original region (2))
and subtracting the third term from region (C), removing gluons with $E < \Lambda$ inside
measured jets, which are already correctly accounted for in $S_{ij}^{\rm meas}(\tau_a^k)$.

The total soft function at $O(\alpha_s)$ is then given by



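$$\begin{aligned} S(\tau_a^1,\dots,\tau_a^M) = \sum_{i\neq j}\Bigg[ & \sum_{k=1}^{M} S_{ij}^{\rm meas}(\tau_a^k)\prod_{\substack{l=1\\ l\neq k}}^{M}\delta(\tau_a^l) \\ & + S_{ij}^{\rm unmeas}\prod_{l=1}^{M}\delta(\tau_a^l)\Bigg]. \end{aligned}$$ (3.28)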



Note that the second line is independent of the jet shape. This contribution is universal
and will appear in any $N$-jet cross section in which some of the jets defined by
a particular jet algorithm are not measured. 
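
The reorganization of regions (2) and (3) into (A), (B), and (C) in Eq. (3.27) is at heart a statement about indicator functions on the soft gluon phase space, and it can be checked by brute force. The sketch below is only an illustration: the cut value, the jet counts, and every name in it are hypothetical, and a gluon is described by just the jet it lands in (or `None` if it is outside all jets) and its energy.

```python
from itertools import product

LAMBDA = 1.0   # hypothetical energy cut on radiation outside the jets
M, N = 2, 4    # hypothetical labeling: jets 1..M measured, jets M+1..N unmeasured


def original_regions(jet, energy):
    """Weight from regions (2) and (3); jet is None when the gluon is outside all jets."""
    in_unmeasured_jet = jet is not None and jet > M          # region (2), any energy
    soft_outside_jets = jet is None and energy < LAMBDA      # region (3)
    return int(in_unmeasured_jet) + int(soft_outside_jets)


def reorganized_regions(jet, energy):
    """Weight from (A) + sum over unmeasured jets of (B) - sum over measured jets of (C)."""
    incl = int(energy < LAMBDA)                                                # (A)
    hard_in_unmeasured = int(jet is not None and jet > M and energy > LAMBDA)  # (B)
    soft_in_measured = int(jet is not None and jet <= M and energy < LAMBDA)   # (C)
    return incl + hard_in_unmeasured - soft_in_measured


def check_all_cases():
    """Compare both bookkeeping schemes for every location and both energy ranges."""
    locations = [None] + list(range(1, N + 1))
    energies = [0.5 * LAMBDA, 2.0 * LAMBDA]
    return all(original_regions(j, e) == reorganized_regions(j, e)
               for j, e in product(locations, energies))
```

`check_all_cases()` returns `True`: a soft gluon inside a measured jet is added by (A) and removed again by (C), while a hard gluon inside an unmeasured jet is missed by (A) and restored by (B).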

The contributions of the measured jet piece $S_{ij}^{\rm meas}(\tau_a^k)$ to the anomalous dimension
of the soft function are given in Table 3.2 separately in the cases that $k = i$ or
$j$ and $k \neq i, j$. These contributions are given by the form Eq. (3.17), with the
values given in Table 3.2. The results are given in terms of the distance measure
$t_{ij} = \tan(\psi_{ij}/2)/\tan(R/2)$ between jets of size $R$ separated by an angle $\psi_{ij}$, and the
angle $\beta_{ij}$ between the $ik$ and $jk$ planes. For well-separated jets, the contributions to
the non-cusp part of the anomalous dimension are suppressed by $1/t^2$.

The "inclusive" contribution $S_{ij}^{\rm incl}$ for a soft gluon going anywhere with energy
$E < \Lambda$ contributes a term to the soft anomalous dimension given by the general form
Eq. (3.13), with values given in Table 3.2.

Finally, for the contributions of soft gluons entering jets with $E > \Lambda$ or $E < \Lambda$
in (B) and (C) in the list above, we can combine the last two terms in Eq. (3.27)
using the following observation. The sum $S_{ij}^{k} + \bar S_{ij}^{k}$ is the contribution of a soft
gluon entering jet $k$ with any energy. The phase space integral for this contribution
contains a scaleless integral (of energy from 0 to $\infty$), and so this sum is zero in pure
dimensional regularization. Thus we can set $\bar S_{ij}^{k} = -S_{ij}^{k}$, and the last two terms in
Eq. (3.27) add up to the contribution of a soft gluon entering any jet with energy
$E > \Lambda$. These contributions can again be split up into those with $k = i$ or $j$ and
$k \neq i, j$. They contribute parts to the soft anomalous dimension falling into the form
Eq. (3.13), with values in Table 3.2. The non-cusp pieces are again suppressed by
$1/t^2$ for well-separated jets.

Using the contributions described above, we sum over directions $i$ and $j$ and obtain
the anomalous dimensions for $S^{\rm meas}(\tau_a^k)$ and $S^{\rm unmeas}$, which we record in Table 3.2.

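The soft function obeys the renormalization group equation

$$\begin{aligned} \mu\frac{d}{d\mu}S(\tau_a^1,\dots,\tau_a^M;\mu) = {} & \int d\tau_a^{1\prime}\cdots d\tau_a^{M\prime}\; S(\tau_a^{1\prime},\dots,\tau_a^{M\prime};\mu) \\ & \times \gamma_S(\tau_a^1-\tau_a^{1\prime},\dots,\tau_a^M-\tau_a^{M\prime};\mu). \end{aligned}$$ (3.29)

Because the soft function at $O(\alpha_s)$ in Eq. (3.28) is a sum of terms that depend non-trivially
on at most one jet shape, the anomalous dimension can be decomposed as

$$\gamma_S(\tau_a^1,\dots,\tau_a^M;\mu) = \gamma_S^{\rm unmeas}(\mu)\,\delta(\tau_a^1)\cdots\delta(\tau_a^M) + \sum_{k=1}^{M}\gamma_S^{\rm meas}(\tau_a^k;\mu)\prod_{\substack{j=1\\ j\neq k}}^{M}\delta(\tau_a^j).$$ (3.30)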

The non-cusp parts of the anomalous dimension of $S^{\rm meas}$ and $S^{\rm unmeas}$ share the same
dependence on $\tau$, and therefore we are free to shift non-cusp terms between
anomalous dimensions. While this does not change the physics, it allows us to organize
the anomalous dimensions to match the contributions in Table 3.1, which we find more
convenient for assembling the solution to the soft RGE Eq. (3.29). By making the
non-cusp part of $S^{\rm meas}(\tau_a^k)$ zero, we find that the shifted $S^{\rm meas}(\tau_a^k)$ is equal to $S^k(\tau_a^k)$
from Table 3.1, and that the shifted $S^{\rm unmeas}$ is equal to $S^{\rm pair} + \sum_i S^i$.

Finally, we can give the soft function anomalous dimension. Omitting terms which 










7ir[a] 




1 "pT-i rp 1 

2I ±i ■ 


^PT- T in V / ; 


cmeas f^k\ 
'^ij Va ) 





1 , t^, t^i.— 2t,i-t,i- cos flio+l 


cincl 


-TTi ■ Tj 


rTi.T,(ln(nrn,/2) + ln^) 


qi 


-FT- • T • 


-irT..T,(lnW^+ln^) 


ok 





1 t'^jt'^; —2tjh.t^h.cos3i4-\-l 
1,111 (t2^,-l)(t^^^_l) 




p 1 rp2 

^ 1-a^k 


-FT^lntan^f 


^unmeas 





rEi^,T,-T,ln(n,-n,/2) 
+r Eili Tf In tan2(i?/2) + ©(l/t^) 



Table 3.2: Soft anomalous dimensions. Contributions to the anomalous dimension
of the soft function are given for soft gluons emitted by jet $i$ or $j$ and entering jet $k$
(with $k = i$ or $j$ in the first row and $k \neq i,j$ in the second) and being measured with
angularity $\tau_a^k$; soft gluons emitted by jet $i$ or $j$ in any direction with energy $E < \Lambda$ in
the third row; and soft gluons emitted by jet $i$ or $j$ and entering jet $k$ with angularity
unmeasured in the fourth ($k = i$ or $j$) and fifth ($k \neq i,j$) rows. In the second-to-last
row we summed the first two rows over all pairs of jets $i,j$ to obtain the measured
contribution for a specific $\tau_a^k$, and in the last row, we summed all unmeasured soft
gluon contributions. In the last two rows, we have taken the large $t$ limit. $j_F = 1$ in
all cases.

are suppressed by $O(1/t^2)$, the soft function anomalous dimension is



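$$\begin{aligned} \gamma_S(\tau_a^1,\dots,\tau_a^M;\mu) = {} & \Gamma(\alpha_s)\left[-\sum_{k=1}^{M} T_k^2\,\frac{1}{1-a}\ln\frac{\mu^2}{\omega_k^2} + \sum_{i=M+1}^{N} T_i^2\ln\tan^2\frac{R}{2} + \sum_{i\neq j} T_i\cdot T_j\ln\frac{n_i\cdot n_j}{2}\right] \\ & \times\, \delta(\tau_a^1)\cdots\delta(\tau_a^M) \\ & + 2\,\Gamma(\alpha_s)\sum_{k=1}^{M} T_k^2\,\frac{1}{1-a}\left[\frac{\Theta(\tau_a^k)}{\tau_a^k}\right]_+\prod_{\substack{j=1\\ j\neq k}}^{M}\delta(\tau_a^j). \end{aligned}$$ (3.31)

The solution of the RGE is

$$\begin{aligned} S(\tau_a^1,\dots,\tau_a^M;\mu) = {} & \int d\tau_a^{1\prime}\cdots d\tau_a^{M\prime}\; S(\tau_a^{1\prime},\dots,\tau_a^{M\prime};\mu_0) \\ & \times\, U_S^{\rm pair}(\mu,\mu_0)\prod_{k=1}^{M} U_S^k(\tau_a^k-\tau_a^{k\prime};\mu,\mu_0)\prod_{i=M+1}^{N} U_S^i(\mu,\mu_0), \end{aligned}$$ (3.32)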



where $U_S^k(\tau_a^k)$ is an evolution kernel of a convolved RGE and is of the form in
Eq. (3.19), and $U_S^i$ and $U_S^{\rm pair}$ are evolution kernels of multiplicative RGEs and are
of the form in Eq. (3.15). The evolution kernels $U_S^k(\tau_a^k)$, $U_S^i$, and $U_S^{\rm pair}$ correspond to
the soft anomalous dimensions from $S^k(\tau_a^k)$, $S^i$, and $S^{\rm pair}$ in Table 3.1.



3.5.4 Consistency of Factorization 

Adding together all jet and soft anomalous dimensions, we find, miraculously, the
$R$ dependence cancels between the unmeasured jet anomalous dimension Eq. (3.25)
and sum over unmeasured jets in the soft function Eq. (3.31), and the $\tau_a \neq 0$
dependence cancels between the measured jet anomalous dimension Eq. (3.26) and the
sum over measured jets in the soft function. The remaining pieces precisely match
the hard anomalous dimension given in Sec. 3.5.1 such that the consistency
condition Eq. (3.3) is satisfied. Note, however, that satisfying Eq. (3.3) exactly required
that we drop corrections of $O(1/t^2)$ in the soft function. Requiring consistency of the
anomalous dimensions at one loop has provided the measure $1/t^2 \ll 1$ to quantify the
condition we used in justifying the factorization theorem in Sec. 3.3 that jets be "well
separated".
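
To get a feeling for how restrictive the "well separated" condition is, the separation measure $t_{ij} = \tan(\psi_{ij}/2)/\tan(R/2)$ quoted in Sec. 3.5.3 can be evaluated for a few configurations. The sketch below is purely illustrative; the function names and the sample angles are hypothetical choices, not taken from the analysis.

```python
import math


def separation(psi_ij, R):
    """t_ij = tan(psi_ij / 2) / tan(R / 2) for jets of size R separated by angle psi_ij (radians)."""
    return math.tan(0.5 * psi_ij) / math.tan(0.5 * R)


def power_correction(psi_ij, R):
    """Size of the 1/t^2 terms that are dropped in the soft anomalous dimension."""
    return 1.0 / separation(psi_ij, R) ** 2


if __name__ == "__main__":
    # Symmetric three-jet configuration (120 degrees between jets) for two jet sizes.
    for R in (0.4, 1.0):
        print(R, power_correction(2.0 * math.pi / 3.0, R))
    # Back-to-back jets: tan(pi/2) is numerically huge, so 1/t^2 is negligible for any R.
    print(power_correction(math.pi, 1.0))
```

For three symmetric jets the correction grows quickly with the jet size, while for back-to-back jets it vanishes for any $R$, in line with the remark in the summary that this parameter does not show up in two-jet studies.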

3.6 Application: Jet Shapes in $e^+e^- \to 3$ Jets

As an example of using the above results to calculate a jet observable in an exclusive
multijet final state, we give the resummed angularity jet shape distribution for a
single measured quark or gluon jet in a three-jet final state in $e^+e^-$ annihilation. The
techniques to derive and solve the RGEs to resum logarithms in jet shape distributions
in SCET are essentially identical to those for event shape distributions as performed
in [56, 57, 87, 88].

We assemble the appropriate RG-evolved hard function, measured jet function,
two unmeasured jet functions, and soft function given in Secs. 3.4 and 3.5. Evolving
these from their tree-level values at initial scales $\mu_H$, $\mu_J^i$, $\mu_S$ to the scale $\mu$ with NLL
running, we obtain the distribution in the shape $\tau_a$ of jet 1 with jets 2, 3 unmeasured.
Written as the derivative of the radiator,

1 (jcTp^p^Pa _ dR{Ta) 
"^PiPaPa 



X 



X 



u>h{iJ.,IJ.h) / ,,1 \ (2-a)u;l(/i,;il) / o \ "^jCm.Mj) / ,,3 \ "^jCm-Mj) 



(3.33) 



u;i(Ai,Aii)+w^(//,/is) 
la 



where $\sigma_{P_1P_2P_3}$ is the cross section differential in the three jet momenta $P_i = \omega_i n_i$,
the effective hard scale is $\omega_H = (\omega_1^{T_1^2}\omega_2^{T_2^2}\omega_3^{T_3^2})^{1/T^2}$ where $T^2 = T_1^2 + T_2^2 + T_3^2$, and $\mathcal{K}$ is
the sum of the hard, jet, and soft evolution factors,

$$\mathcal{K} = K_H(\mu,\mu_H) + \sum_{i=1}^{3}\left[K_J^i(\mu,\mu_J^i) + K_S^i(\mu,\mu_S)\right] + K_S^{\rm pair}(\mu,\mu_S).$$ (3.34)

Inspection of Eq. (3.33) suggests the reasonable choices for initial scales to minimize
large logarithms.^

$$\mu_H = \omega_H, \quad \mu_J^1 = \omega_1\tau_a^{1/(2-a)}, \quad \mu_J^{2,3} = \omega_{2,3}\tan\frac{R}{2}, \quad \mu_S = \omega_1\tau_a.$$ (3.35)

For the unmeasured jet scales $\mu_J^{2,3}$ we kept in mind the factor of $\ln\tan^2(R/2)$ present in
$K_J$ (see Table 3.1). To obtain the shape of a quark or gluon jet from Eq. (3.33) we
designate jet 1 as either quark or gluon and plug in the appropriate color factors and
anomalous dimensions from Table 3.1 into $\omega_F$ and $K_F$ appearing in Eq. (3.33). We
report on a more detailed phenomenological study of these jet shapes in [25] and their
application to the discrimination of quark vs. gluon jets in future work. 
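
The scale choices of Eq. (3.35) are simple enough to tabulate directly. The sketch below assumes the reading $\mu_H = \omega_H$, $\mu_J^1 = \omega_1\tau_a^{1/(2-a)}$, $\mu_J^{2,3} = \omega_{2,3}\tan(R/2)$, $\mu_S = \omega_1\tau_a$, with $\omega_H$ the color-weighted geometric mean of the jet energies described above. The function names, the dictionary keys, and the sample inputs are hypothetical.

```python
import math

C_F, C_A = 4.0 / 3.0, 3.0   # quadratic Casimirs T_i^2 for quark and gluon jets


def effective_hard_scale(omegas, casimirs):
    """omega_H = (prod_i omega_i^{T_i^2})^{1/T^2}, with T^2 the sum of the T_i^2."""
    t2 = sum(casimirs)
    return math.exp(sum(c * math.log(w) for w, c in zip(omegas, casimirs)) / t2)


def canonical_scales(omegas, casimirs, tau_a, a, R):
    """Initial scales for jet 1 measured with angularity tau_a and jets 2, 3 unmeasured."""
    if a >= 1.0:
        raise ValueError("this sketch assumes angularity parameter a < 1")
    w1, w2, w3 = omegas
    return {
        "mu_H": effective_hard_scale(omegas, casimirs),
        "mu_J1": w1 * tau_a ** (1.0 / (2.0 - a)),
        "mu_J2": w2 * math.tan(0.5 * R),
        "mu_J3": w3 * math.tan(0.5 * R),
        "mu_S": w1 * tau_a,
    }


if __name__ == "__main__":
    # Illustrative inputs only: a quark, antiquark, gluon final state with equal jet energies.
    print(canonical_scales((100.0, 100.0, 100.0), (C_F, C_F, C_A), tau_a=0.01, a=0.0, R=1.0))
```

For small $\tau_a$ this makes the hierarchy $\mu_S < \mu_J^1 < \mu_H$ explicit; it is this separation of scales that generates the large logarithms resummed by the evolution factors.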

3.7 Summary

We have demonstrated the intricate fashion in which the factorized cross section to
produce exclusive $N$-jet final states when $M < N$ jets are measured with a jet observable
remains consistent for NLL running. We identified sources of power corrections to
this factorization theorem and the consistency condition. Up to these corrections, the
factorization theorem remains consistent independently of the number of measured
and unmeasured jets and number of quark and gluon jets.

One novel power correction that explicitly manifested itself in our calculation is 
in the separation parameter t. Since 1/t is identically zero for all jet sizes when jets 
are back-to-back, this parameter has not been identified in the literature before. 

We find that, when a jet measurement is performed, the NLL resummed result has
no dependence on the jet algorithm across the algorithms we considered (the Snowmass
and SISCone cone algorithms and the inclusive $k_T$, anti-$k_T$, and Cambridge-Aachen
$k_T$-type algorithms). In addition, for unmeasured jets the dependence on the
jet algorithm parameter $R$ (or $D$) is universal across these algorithms at NLL.

^There are also phase-space logarithms $\ln(\mu_S/\Lambda)$ in the finite part of the soft function [25] which
are not resummed by the choices Eq. (3.35). These logarithms can be minimized by choosing
$\Lambda \sim \omega_1\tau_a$ or, when these scales are disparate, by performing a further factorization of the soft
function as we explain in [25].

Jet shapes such as angularities can be used to describe the substructure of a 
jet, and can be used, for instance, to distinguish quark jets from gluon jets. In a 
future publication we will develop and describe a strategy to do so. We presented 
our calculations in such a way that allows for straightforward adaptation to other 
measurements as well, as we separated those parts of the jet and soft function that 
depend only on the jet algorithm and not the choice of jet observable. In addition, 
the ideas we discussed such as the power corrections that arise in the factorization 
formula and the method of calculating the soft and jet functions, will carry over to 
a calculation involving jet algorithms at hadron colliders, essentially amounting to 
having algorithm parameters that are invariant under boosts along the beam axis.

Chapter 4

JET SUBSTRUCTURE, THEORY AND PRACTICE 

A jet traditionally has been thought of as a proxy for a high-energy parton, e.g., 
a quark or gluon produced in a high-energy proton collision. As an example of this 
approach, consider a measurement of the top quark mass at the Tevatron. At the 
parton level, the production of a top-antitop pair looks like Fig. 4.1. Each top quark
decays to a W boson and a bottom quark. The W can then decay either to a pair of
quarks or to a charged lepton-neutrino pair. In this case one W decays leptonically, 
and one decays to quarks. At this level of description, the outgoing particles include 
two bottom quarks and two other quarks (a u and a d, say). These quarks will shower 
and hadronize, leading to jets. A reconstruction analysis forms jets and matches jets 
to partons. To the extent that the showers from each quark are independent and well 
separated, the total four-momenta of these jets will correspond to the four-momenta 
of the partons. 

Consider, however, the case where the top quarks are produced with energies 
much larger than their mass. They will be highly boosted, and their decay products 
will move closer together in the lab frame. As the angular distance between partons 
becomes comparable to the characteristic size of their shower, they will not in general 
appear as distinct jets. A hadronic top quark decay might appear as two or even one 
jet instead of three. 
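
A rough estimate shows where this merging sets in. In the small-angle toy model used later in Sec. 4.1.1, a jet of mass $m_J$ and transverse momentum $p_{T_J}$ built from two massless partons has $\Delta R_{12} > 2\sqrt{x_J} = 2m_J/p_{T_J}$, so the decay products can only fit inside a jet of size $D$ once $p_T$ exceeds roughly $2m/D$. The sketch below applies this to a W boson and, much more crudely, to a top quark, whose three-body decay is not really covered by a two-body formula. The function names are hypothetical and the masses are approximate values in GeV.

```python
M_W, M_TOP = 80.4, 173.0   # approximate masses in GeV


def min_opening_angle(mass, pt):
    """Smallest Delta R_12 between two massless daughters (small-angle estimate 2 m / pT)."""
    return 2.0 * mass / pt


def min_pt_for_single_jet(mass, D):
    """pT above which both daughters can fit inside one jet of size D."""
    return 2.0 * mass / D


if __name__ == "__main__":
    for name, mass in (("W", M_W), ("top", M_TOP)):
        print(name, min_pt_for_single_jet(mass, D=1.0), min_opening_angle(mass, pt=1000.0))
```

With $D = 1.0$ this puts the onset of merging at a few hundred GeV of transverse momentum, which is why substructure techniques matter at LHC energies.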

When a top quark decay can be modeled as producing three jets, we can search 
for three jets, "assign them" to the partons of the decay, and then proceed with the 
analysis as if we are talking about partons instead of jets. All of the subtleties of 
the QCD shower, hadronization, etc. are hidden in the jet-to-parton matching step.
But when a top quark decay appears as a single jet, simply searching for one jet and
calling it a "top quark" (the same way we assign a light jet to a light quark)
throws away information. In addition to the four-momentum of the top quark, we
also have information about its decay, for example that a real W boson should be
present. To search for a top quark jet, then, we should use our knowledge of the top
quark's decay to look at the substructure of the jet we think may be a top quark.

Figure 4.1: A parton-level description of the production of a top-antitop pair of
quarks. Each top quark decays to a W boson and a bottom quark. The W can then
decay either to a pair of quarks or to a charged lepton-neutrino pair. In this case one
W decays leptonically, and one decays to quarks. The result is described as an event
with an electron, missing energy (from the invisible neutrino), and "four jets".

In principle, any heavy particle that decays to light quarks or gluons can be suf- 
ficiently boosted to be observed as a single jet. To identify these decays, we should 
look for jets with the appropriate substructure. The most important background to 
jets from heavy particle decays will be pure QCD jets. Although QCD jets tend to 
be light, the tail of their mass distribution combined with their enormous production
cross section means that they will be a background to essentially any jet signature.
Separating heavy particle jets from this background will require a thorough under- 
standing of the substructure we expect from both types of jets. In the next section, 
we will take some first steps in this direction by working out parton-level predictions 
for the substructure of jets arising from pure QCD as well as the decays of heavy par- 
ticles. In subsequent sections, we will consider how showering, jet reconstruction, and 
splash-in modify these predictions and constrain our ability to distinguish different 
types of jets. 

4.1 Parton-level predictions 

Understanding the detailed substructure of jets presents an interesting challenge.^
QCD jets are typically characterized by the soft and collinear kinematic regimes
that dominate their evolution, but QCD populates the entire phase space of allowed
kinematics. Due to its immense cross section relative to other processes, small effects
in QCD can produce event rates that still dominate other signals, even after cuts.
Furthermore, the full kinematic distributions in QCD jet substructure currently can
only be approximately calculated, so we will focus on understanding the key features
of jets and the systematic effects that arise from the algorithms that define them.
Note that even when an on-shell heavy particle is present in a jet, the corresponding
kinematic decay(s) will contribute to only a few of the branchings within the jet. QCD
will still be responsible for the bulk of the complexity in the jet substructure, which is
produced as the colored partons shower and hadronize, leading to the high multiplicity
of color singlet particles observed in the detector.

^This section, with small modifications, is taken from Sections III and IV of [2].

It is a complex question to ask whether the jet substructure is accurately recon- 
structing the parton shower, and somewhat misguided, as the parton shower repre- 
sents colored particles while the experimental algorithm only deals with color singlets. 
A more sensible question, and an answerable one, is to ask whether the algorithm is 
faithful to the dynamics of the parton shower. This is the basis of the metrics of the 
$k_T$ and CA recombination algorithms: the ordering of recombinations captures the
dominant kinematic features of branchings within the shower. In particular, the cross
section for an extra real emission in the parton shower contains both a soft ($z$) and a
collinear ($\Delta R$) singularity:

$$d\sigma_{n+1} \sim d\sigma_n\,\frac{dz}{z}\,\frac{d\Delta R}{\Delta R}.$$ (4.1)

While these singularities are regulated (in perturbation theory) by virtual corrections,
the enhancement remains, and we expect emissions in the QCD parton shower to
be dominantly soft and/or collinear. Due to their different metrics, the $k_T$ and CA
algorithms will recombine these emissions differently, producing distinct substructure.
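
A three-particle toy event makes the difference between the two metrics concrete. The sketch below uses the standard pairwise distance $d_{ij} = \min(p_{Ti}^{2p}, p_{Tj}^{2p})\,\Delta R_{ij}^2/D^2$ and beam distance $d_{iB} = p_{Ti}^{2p}$, with $p = 1$ for $k_T$ and $p = 0$ for CA, and reports only the first clustering step. The event, the function names, and the dictionary layout are hypothetical, and this is not the implementation used for the studies in this chapter.

```python
import math


def delta_r2(a, b):
    """Squared distance between two particles in the rapidity-azimuth plane."""
    dphi = abs(a["phi"] - b["phi"])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (a["y"] - b["y"]) ** 2 + dphi ** 2


def first_step(particles, p, D=1.0):
    """First step of sequential recombination with exponent p (p=1: kT, p=0: CA)."""
    names = sorted(particles)
    best_d, best_action = None, None
    for n, i in enumerate(names):
        d_beam = particles[i]["pt"] ** (2 * p)
        if best_d is None or d_beam < best_d:
            best_d, best_action = d_beam, ("jet", i)
        for j in names[n + 1:]:
            scale = min(particles[i]["pt"] ** (2 * p), particles[j]["pt"] ** (2 * p))
            d_ij = scale * delta_r2(particles[i], particles[j]) / D ** 2
            if d_ij < best_d:
                best_d, best_action = d_ij, ("merge", i, j)
    return best_action


# Two hard, nearly collinear particles (A, B) plus one soft particle (C) at a wider angle.
toy_event = {
    "A": {"pt": 100.0, "y": 0.0, "phi": 0.0},
    "B": {"pt": 80.0, "y": 0.0, "phi": 0.1},
    "C": {"pt": 1.0, "y": 0.0, "phi": 0.5},
}
```

Here `first_step(toy_event, p=0)` merges the closest pair, A and B, whereas `first_step(toy_event, p=1)` first attaches the soft particle C to B: CA orders purely in angle, while $k_T$ recombines soft radiation early.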
In the rest of this section, we will consider some generic features of jets and jet
substructure. We will elaborate on this discussion in following subsections, where we
will contrast the features of jets arising from heavy particle decays with those from
pure QCD showering.

4.1.1 A simple model for QCD substructure

To establish an intuitive level of understanding of jet substructure in QCD we consider
a toy model description of jets in terms of a single branching and the kinematic
variables $x_J$, $z$, and $\Delta R_{12}$ (introduced in Sec. 2.4.4). We take the jet to have a
fixed $p_{T_J}$. We combine the leading-logarithmic dynamics of Eq. (4.1) with the
approximate expression for the jet mass in Eq. (2.41), and we label this combined
approximation as the "LL" approximation. Recall that this approximation for the jet
mass is useful for small subjet masses and small opening angles. From Section 2.4.4,
recall that fixing $x_J$ provides lower bounds on both $z$ and $\Delta R_{12}$ and ensures finite
results for the LL approximation. This approach leads to the following simple form
for the $x_J$ distribution,

$$\frac{1}{\sigma}\frac{d\sigma_{LL}}{d(m_J^2/p_{T_J}^2)} \equiv \frac{1}{\sigma}\frac{d\sigma_{LL}}{dx_J} \sim \int_0^{1/2}\frac{dz}{z}\int_0^{D}\frac{d\Delta R_{12}}{\Delta R_{12}}\,\delta\!\left(x_J - z(1-z)\Delta R_{12}^2\right) = \frac{-\ln\left(1-\sqrt{1-4x_J/D^2}\right)}{2x_J}\,\Theta\!\left[\frac{D^2}{4} - x_J\right].$$ (4.2)

Note we are integrating over the phase space of Fig. 2.6a, treating it as one-dimensional.
The resulting distribution is exhibited in Fig. 4.2 for $D = 1.0$ where we have multiplied
by a factor of $x_J$ to remove the explicit pole. We observe both the cutoff at
$x_J = D^2/4$ arising from the kinematics discussed in Section 2.4.4 and the $-\ln(x_J)/x_J$
small-$x_J$ behavior arising from the singular soft/collinear dynamics. Even if the infrared
singularity is regulated by virtual emissions and the distribution is resummed,
we still expect QCD jet mass distributions (with fixed $p_{T_J}$) to be peaked at small
mass values and be rapidly cut off for $m_J > p_{T_J}D/2$.
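
The closed form in Eq. (4.2) can be cross-checked numerically. Using the small-angle relation $x_J = z(1-z)\Delta R_{12}^2$, the delta function removes the $\Delta R_{12}$ integral and leaves a factor $1/(2x_J)$, while the requirement $\Delta R_{12} < D$ restricts $z$ to lie between $(1-\sqrt{1-4x_J/D^2})/2$ and $1/2$. The sketch below integrates $dz/z$ over that range with a midpoint rule and compares with the logarithm; the function names and the number of integration steps are arbitrary choices.

```python
import math


def z_min(x, D=1.0):
    """Lower limit on z at fixed x_J from requiring Delta R_12 < D."""
    return 0.5 * (1.0 - math.sqrt(1.0 - 4.0 * x / D ** 2))


def dsigma_dx_closed(x, D=1.0):
    """Closed form of the LL x_J distribution, up to overall normalization."""
    if x <= 0.0 or x >= D ** 2 / 4.0:
        return 0.0
    return -math.log(1.0 - math.sqrt(1.0 - 4.0 * x / D ** 2)) / (2.0 * x)


def dsigma_dx_numeric(x, D=1.0, n=20000):
    """Same quantity from a midpoint-rule integral of dz/z times the Jacobian 1/(2 x_J)."""
    if x <= 0.0 or x >= D ** 2 / 4.0:
        return 0.0
    lo, hi = z_min(x, D), 0.5
    h = (hi - lo) / n
    total = sum(1.0 / (lo + (k + 0.5) * h) for k in range(n)) * h
    return total / (2.0 * x)


if __name__ == "__main__":
    for x in (0.01, 0.05, 0.1, 0.2, 0.3):
        # x * dsigma/dx is the combination plotted in Fig. 4.2
        print(x, x * dsigma_dx_closed(x), x * dsigma_dx_numeric(x))
```

The two evaluations agree to the accuracy of the integration, and both vanish above the kinematic cutoff $x_J = D^2/4$.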

We can improve this approximation somewhat by using the more quantitative
perturbative analysis described in [23]. In perturbation theory jet masses appear at
next-to-leading order (NLO) in the overall jet process where two (massless) partons
can be present in a single jet. Strictly, the jet mass is then being evaluated at leading
order (i.e., the jet mass vanishes with only one parton in a jet) and one would prefer
an NNLO result to understand scale dependence (we take $\mu = p_{T_J}/2$). Here we will
simply use the available NLO tools [89]. This approach leads to the very similar
$x_J$ distribution displayed in Fig. 4.3, plotted for two values of $p_{T_J}$ (at the LHC,
with $\sqrt{s} = 14$ TeV). We are correctly including the full NLO matrix element (not
simply the singular parts), the full kinematics of the jet mass (not just the small-angle
approximation) and the effects of the parton distribution functions. In this case the
distribution is normalized by dividing by the Born jet cross section. Again we see
the dominant impact of the soft/collinear singularities for small jet masses. Note also
that there is little residual dependence on the value of the jet momentum and that
again the distribution essentially vanishes for $x_J > 0.25$, $m_J/p_{T_J} > 0.5 = D/2$. The
average jet mass suggested by these results is $\langle m_J/p_{T_J}\rangle \sim 0.2\,D$. Because the jet only
contains two partons at NLO, we are still ignoring the effects of the nonzero subjet
masses and the effects of the ordering of mergings imposed by the algorithm itself.
For example, at this order there is no difference between the CA and $k_T$ algorithms.

Figure 4.2: Distribution in $x_J$ for a simple LL toy model with $D = 1.0$.

Figure 4.3: NLO distribution in $x_J$ for $k_T$-style QCD jets with $D = 1.0$, $\sqrt{s} = 14$
TeV, and two values of $p_{T_J}$.

Next we consider the $z$ and $\Delta R_{12}$ distributions for the LL approximation where a
single recombination of two (massless) partons is required to reconstruct as a jet of
definite $p_{T_J}$ and mass (fixed $x_J$). To that end we can "undo" one of the integrals in
Eq. (4.2) and consider the distributions for $z$ and $\Delta R_{12}$. We find for the $z$ distribution
the form

$$\frac{1}{\sigma}\frac{d\sigma_{LL}}{dx_J\,dz} \sim \frac{1}{2 z x_J}\,\Theta\!\left[z - \frac{1-\sqrt{1-4x_J/D^2}}{2}\right]\Theta\!\left[\frac{1}{2} - z\right].$$ (4.3)

As expected, we see the poles in $z$ and $x_J$ from the soft/collinear dynamics, but, as
in Section 2.4.4, the constraint of fixed $x_J$ yields a lower limit for $z$. Recall that the
upper limit for $z$ arises from its definition, again applied in the small-angle limit. Thus
the LL QCD distribution in $z$ is peaked at the lower limit but the characteristic turn-on
point is fixed by the kinematics, requiring the branching at fixed $x_J$ to be in a jet
of size $D$. This behavior is illustrated in Fig. 4.4 for various values of $x_J = 1/(\gamma^2 - 1)$
corresponding to those used in Section 2.4.4. 
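
At fixed $x_J$ the $z$ dependence of Eq. (4.3) is just $1/z$ between the kinematic lower limit and $1/2$, so the unit-area curves of Fig. 4.4 and random draws of $z$ can both be written in closed form. The sketch below is an illustration under that reading of Eq. (4.3); the function names are hypothetical, and the sampler uses the inverse of the cumulative distribution, $z = z_{\min}\,(1/(2z_{\min}))^u$ with $u$ uniform on $[0, 1)$.

```python
import math
import random


def z_min(x, D=1.0):
    """Lower limit on z at fixed x_J; requires 0 < x_J < D^2 / 4."""
    if not 0.0 < x < D ** 2 / 4.0:
        raise ValueError("x_J must lie in (0, D^2/4)")
    return 0.5 * (1.0 - math.sqrt(1.0 - 4.0 * x / D ** 2))


def z_pdf(z, x, D=1.0):
    """LL z distribution at fixed x_J, normalized to unit area on [z_min, 1/2]."""
    lo = z_min(x, D)
    if z < lo or z > 0.5:
        return 0.0
    return 1.0 / (z * math.log(0.5 / lo))


def sample_z(x, D=1.0, rng=random):
    """Draw z from z_pdf by inverting its cumulative distribution."""
    lo = z_min(x, D)
    return lo * (0.5 / lo) ** rng.random()


if __name__ == "__main__":
    rng = random.Random(1)
    draws = [sample_z(0.05, 1.0, rng) for _ in range(5)]
    print(z_min(0.05), draws)
```

Because the density falls like $1/z$, the draws pile up just above $z_{\min}$, which is the "peaked at the lower limit" behavior described above; lowering $x_J$ lowers $z_{\min}$ and moves the turn-on point toward softer branchings.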

The expression for the $\Delta R_{12}$ dependence in the LL approximation is

$$\frac{1}{\sigma}\frac{d\sigma_{LL}}{dx_J\,d\Delta R_{12}} \sim \frac{2\,\Theta\!\left[\Delta R_{12} - 2\sqrt{x_J}\right]\Theta\!\left[D - \Delta R_{12}\right]}{\Delta R_{12}^2\,\sqrt{\Delta R_{12}^2 - 4x_J}\,\left(1 - \sqrt{1 - 4x_J/\Delta R_{12}^2}\right)}.$$ (4.4)

This distribution is illustrated in Fig. 4.5 for the same values of $x_J$ as in Fig. 4.4.

Figure 4.4: Distribution in $z$ for LL QCD jets for $D = 1.0$ and various values of $x_J$.
The curves are normalized to have unit area.



As with the $z$ distribution the kinematic constraint of being a jet with a definite $x_J$
yields a lower limit, $\Delta R_{12} > 2\sqrt{x_J}$, along with the expected upper limit, $\Delta R_{12} < D$.
However, for $\Delta R_{12}$ the change of variables also introduces an (integrable) square
root singularity at the lower limit. This square root factor tends to be numerically
more important than the $1/\Delta R_{12}^2$ factor.^ Since this square root singularity arises
from the choice of variable (a kinematic effect), we will see that it is also present for
heavy particle decays, suggesting that the $\Delta R_{12}$ variable will not be as useful as $z$ in
distinguishing QCD jets from heavy particle decay jets. 
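
Since the $\Delta R_{12}$ distribution of Eq. (4.4) is obtained from the same integrand as the $x_J$ distribution of Eq. (4.2), integrating it over $2\sqrt{x_J} < \Delta R_{12} < D$ should reproduce the closed form in Eq. (4.2). The sketch below checks this numerically. To tame the integrable square-root singularity it substitutes $u = \sqrt{\Delta R_{12}^2 - 4x_J}$, after which the integrand becomes $2/[\Delta R_{12}^2(\Delta R_{12} - u)]$ and Simpson's rule can be applied. The function names and the step count are arbitrary choices.

```python
import math


def dsigma_dx_closed(x, D=1.0):
    """Closed form of the LL x_J distribution, Eq. (4.2), up to normalization."""
    if x <= 0.0 or x >= D ** 2 / 4.0:
        return 0.0
    return -math.log(1.0 - math.sqrt(1.0 - 4.0 * x / D ** 2)) / (2.0 * x)


def dsigma_dx_dR(dR, x, D=1.0):
    """LL distribution in Delta R_12 at fixed x_J, Eq. (4.4), up to normalization."""
    if dR <= 2.0 * math.sqrt(x) or dR > D:
        return 0.0
    root = math.sqrt(dR ** 2 - 4.0 * x)
    return 2.0 / (dR ** 2 * root * (1.0 - root / dR))


def integrate_over_dR(x, D=1.0, n=2000):
    """Simpson integral of Eq. (4.4) over Delta R_12, using u = sqrt(Delta R_12^2 - 4 x_J)."""
    if x <= 0.0 or x >= D ** 2 / 4.0:
        return 0.0
    u_max = math.sqrt(D ** 2 - 4.0 * x)
    h = u_max / n   # n must be even for Simpson's rule

    def g(u):
        dR = math.sqrt(4.0 * x + u * u)
        return 2.0 / (dR ** 2 * (dR - u))

    total = g(0.0) + g(u_max)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(k * h)
    return total * h / 3.0
```

For example, `integrate_over_dR(0.05)` and `dsigma_dx_closed(0.05)` agree to high precision, confirming that the square-root factor is purely a Jacobian effect of trading $z$ for $\Delta R_{12}$.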

Thus, in our toy QCD model with a single recombination, leading-logarithm dynamics
and the small-angle jet mass definition, the constraints due to fixing $x_J$ tend
to dominate the behavior of the $z$ and $\Delta R_{12}$ distributions, with limited dependence
on the QCD dynamics and no distinction between the CA and $k_T$ algorithms. However,
this situation changes dramatically when we consider more realistic jets with
full showering. We will return to this subject after a brief interlude to consider the
substructure of heavy particle decays.

^One factor of $\Delta R_{12}$ arises from the collinear QCD dynamics while the other comes from the
change of variables. The soft QCD singularity is contained in the denominator factor
$\left(1 - \sqrt{1 - 4x_J/\Delta R_{12}^2}\right) \approx 2z$ for $x_J \ll \Delta R_{12}^2$ (equivalently, $z \ll 1$).

Figure 4.5: Distribution in $\Delta R_{12}$ for LL QCD jets for $D = 1.0$ and various values of
$x_J$. The curves are normalized to have unit area.



4.1.2 Substructure in heavy particle decays

Recombination algorithms have the potential to reconstruct the decay of a heavy
particle. Ideally, the substructure of a jet may be used to identify jets coming from a
decay and reject the QCD background to those jets. In this section, we investigate a
pair of unpolarized parton-level decays: a heavy particle decaying into two massless
quarks (a $1 \to 2$ decay) and a top quark decay into three massless quarks (a two-step
decay). For each decay, we study the available phase space in terms of the lab
frame variables $\Delta R_{12}$ and $z$ and the shaping of kinematic distributions imposed by
the requirement that the decay be reconstructed in a single jet. We will determine
the kinematic regime where decays are reconstructed, and contrast this with the
kinematics for a $1 \to 2$ splitting in QCD.

$1 \to 2$ Decays

We begin by considering a $1 \to 2$ decay with massless daughters. An unpolarized
decay has a simple phase space in terms of the rest frame variables $\cos\theta_0$ and $\phi_0$:

$$\frac{d^2 N_0}{d\cos\theta_0\, d\phi_0} = \frac{1}{4\pi}.$$

Recall from Sec. 2.4.4 that $\cos\theta_0$ and $\phi_0$ are the polar and azimuthal angles of the
heavier daughter particle in the parent particle rest frame relative to the direction of
the boost to the lab frame. In general, we will use $N_0$ to label the distribution of all
decays, while $N$ will label the distribution of decays reconstructed inside a single jet.
$N_0$ is normalized to unity, so that for any variable set $\Phi$,

$$\int d\Phi\, \frac{dN_0}{d\Phi} = 1.$$

The distribution $N$ is defined from $N_0$ by selecting those decays that fit in a single
jet, so that generically

$$\frac{dN}{d\Phi} = \int d\Phi'\, \frac{dN_0}{d\Phi'}\, \delta(\Phi' - \Phi)\, \Theta(\text{single jet reconstruction}).$$

$N$ is naturally normalized to the total fraction of reconstructed decays. The constraints
of single jet reconstruction will depend on the decay and on the jet algorithm
used, and abstractly take the form of a set of $\Theta$ functions. For a $1 \to 2$ decay and
a recombination-type algorithm, the only constraint is that the daughters must be
separated by an angle less than $D$:

$$\Delta R_{12} < D.$$

Since the kinematic limits imposed by reconstruction are sensitive to the boost $\gamma$ of
the parent particle, we will want to consider the quantities of interest at a variety of
$\gamma$ values. To illustrate this $\gamma$ dependence, we first find the total fraction of all decays
that are reconstructed in a single jet for a given value of the boost. We call this
fraction $f_R(\gamma)$:

$$f_R(\gamma) = \int d\cos\theta_0\, d\phi_0\, \frac{d^2N_0}{d\cos\theta_0\, d\phi_0}\, \Theta(D - \Delta R_{12}).$$

In Fig. 4.6, we plot $f_R(\gamma)$ vs. $\gamma$ for several values of $D$. The reconstruction fraction
rises rapidly from no reconstruction to nearly complete reconstruction in a narrow
range in $\gamma$. This indicates that $\Delta R_{12}$ is strongly dependent on $\gamma$ for fixed $\cos\theta_0$ and
$\phi_0$, which we will see below. Conversely, the minimum boost necessary for a decay
to fit in a jet depends strongly on $D$. The turn-on for increasing $\gamma$ is the same effect
as the $(z, \Delta R_{12})$ phase space moving into the allowed region below $\Delta R_{12} = D$ in
Fig. 2.6a as $x_J$ is reduced.

Figure 4.6: Reconstruction fractions $f_R(\gamma)$ as a function of $\gamma$ for various $D$.
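
The turn-on of the reconstruction fraction can be reproduced with a short numerical sketch. It assumes a parent particle moving transversely (zero rapidity) that decays isotropically to two massless daughters, and it integrates the flat rest-frame distribution on a midpoint grid. The function names and grid sizes are illustrative choices, not the code used to produce Fig. 4.6.

```python
import math


def delta_r12(gamma, cos_theta0, phi0):
    """Lab-frame Delta R between the two massless daughters of a 1 -> 2 decay.

    Assumption: the parent is boosted along the x axis (transverse to the
    beam, which is the z axis). Momenta are in units of M/2.
    """
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    sin_theta0 = math.sqrt(1.0 - cos_theta0**2)
    # Rest-frame unit direction of daughter 1, with x as the boost axis.
    n = (cos_theta0, sin_theta0 * math.cos(phi0), sin_theta0 * math.sin(phi0))
    coords = []
    for sign in (1.0, -1.0):  # daughter 2 is back-to-back with daughter 1
        nx, ny, nz = (sign * c for c in n)
        px = gamma * (nx + beta)  # boosted momentum component along x
        py, pz = ny, nz           # components transverse to the boost are unchanged
        pt = math.hypot(px, py)
        # For a massless particle the rapidity equals asinh(pz / pT).
        y = math.asinh(pz / pt) if pt > 0.0 else math.copysign(math.inf, pz)
        coords.append((y, math.atan2(py, px)))
    dy = coords[0][0] - coords[1][0]
    dphi = abs(coords[0][1] - coords[1][1])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(dy, dphi)


def reconstruction_fraction(gamma, D, n_cos=200, n_phi=64):
    """Fraction of decays with Delta R_12 < D, for a flat (cos theta_0, phi_0) distribution."""
    inside = 0
    for i in range(n_cos):
        cos_theta0 = -1.0 + (i + 0.5) * 2.0 / n_cos
        for j in range(n_phi):
            phi0 = (j + 0.5) * 2.0 * math.pi / n_phi
            if delta_r12(gamma, cos_theta0, phi0) < D:
                inside += 1
    return inside / (n_cos * n_phi)


if __name__ == "__main__":
    for gamma in (2.0, 2.5, 3.0, 5.0, 10.0):
        print(gamma, reconstruction_fraction(gamma, D=1.0))
```

Scanning $\gamma$ at fixed $D$ with this sketch shows the fraction staying at zero until the minimum opening angle drops below $D$, then rising quickly toward one.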

To better understand the effect that reconstruction has on the phase space for
decays, we would like to find the distribution of $1 \to 2$ decays in terms of lab frame
variables:

$$\frac{d^2 N_0}{dz\, d\Delta R_{12}}.$$

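With two massless daughters, $\Delta R_{12}$ is given in terms of rest frame variables by

$$\Delta R_{12}^2 = \left[\tanh^{-1}\!\left(\frac{2\gamma\sin\theta_0\sin\phi_0}{\sin^2\theta_0\,(\beta^2\gamma^2 + \sin^2\phi_0) + 1}\right)\right]^2 + \left[\tan^{-1}\!\left(\frac{2\beta\gamma\sin\theta_0\cos\phi_0}{\sin^2\theta_0\,(\beta^2\gamma^2 + \sin^2\phi_0) - 1}\right)\right]^2 \tag{4.5}$$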



with $\beta = \sqrt{1 - \gamma^{-2}}$. This relation is analytically non-invertible, meaning we cannot
write the Jacobian for the transformation

$$\frac{d^2N_0}{d\cos\theta_0\, d\phi_0} \to \frac{d^2N_0}{dz\, d\Delta R_{12}}$$

in closed form. However, $\Delta R_{12}$ has some simple limits. In particular, when the boost
$\gamma$ is large, to leading order in $\gamma^{-1}$,

$$\Delta R_{12} = \frac{2}{\gamma\sin\theta_0} + \mathcal{O}(\gamma^{-3}).$$

This limit is only valid for $\sin\theta_0 \gg \gamma^{-1}$, but as we will see this is the region of
phase space where the decay will be reconstructed in a single jet. The large-boost
approximation describes the key features of the kinematics and is useful for a simple
picture of kinematic distributions when particles are reconstructed in a single jet.

Since $\gamma = \sqrt{1 + 1/x_J}$, this limit is equivalent to the small-angle limit we took in
Sec. 4.1.1. (For $\Delta R^2 \ll 1$, $x_J \approx z(1 - z)\Delta R^2 \ll 1$.) We can see this in Eq. (4.5),
where $\Delta R \sim 1/\gamma$.

The value of $z$ is also simple in the large-boost approximation. In this limit,

$$z = \frac{1 - |\cos\theta_0|}{2} + \mathcal{O}(\gamma^{-2}).$$

With the large-boost approximation, $z$ and $\Delta R_{12}$ are both independent of $\phi_0$. As
noted earlier, both $\Delta R_{12}$ and $z$ depend on $\phi_0$ only through terms that are suppressed
by inverse powers of $\gamma$ (cf. Figs. 2.5 and 2.6). In this limit we can integrate out $\phi_0$ and
find the distributions in $z$ and $\Delta R_{12}$ for all decays. For $z$ the distribution is simply
flat:

$$\frac{dN_0}{dz} \approx 2\,\Theta\!\left(\tfrac{1}{2} - z\right)\Theta(z). \tag{4.6}$$

We have included the limits for clarity. For $\Delta R_{12}$, the distribution is

$$\frac{dN_0}{d\Delta R_{12}} \approx \frac{4\,\Theta(\Delta R_{12} - 2\gamma^{-1})}{\gamma^2\, \Delta R_{12}^2\sqrt{\Delta R_{12}^2 - 4\gamma^{-2}}}. \tag{4.7}$$

This distribution has a lower cutoff requiring $\Delta R_{12} > 2\gamma^{-1}$. This is close to the true
lower limit on $\Delta R_{12}$, $\Delta R_{12} > 2\csc^{-1}\gamma$. Note that in Eq. (4.7), there is an enhancement
at the lower cutoff in $\Delta R_{12}$ due to the square root singularity arising from the change
of variables, just as there was in the QCD result in Eq. (4.4).
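
The origin of the square root singularity in Eq. (4.7) can be seen in a short change of variables. This is a sketch using only the large-boost relation $\Delta R_{12} \approx 2/(\gamma\sin\theta_0)$ and the flat distribution $dN_0/d\cos\theta_0 = 1/2$; the factor of 2 counts the two branches $\cos\theta_0 = \pm|\cos\theta_0|$ that map to the same $\Delta R_{12}$.

```latex
\cos\theta_0 = \pm\sqrt{1 - \frac{4}{\gamma^2 \Delta R_{12}^2}}, \qquad
\left|\frac{d\cos\theta_0}{d\Delta R_{12}}\right|
  = \frac{4}{\gamma^2\, \Delta R_{12}^2 \sqrt{\Delta R_{12}^2 - 4\gamma^{-2}}},
\\[4pt]
\frac{dN_0}{d\Delta R_{12}}
  = 2 \times \frac{1}{2} \times \left|\frac{d\cos\theta_0}{d\Delta R_{12}}\right|
  = \frac{4}{\gamma^2\, \Delta R_{12}^2 \sqrt{\Delta R_{12}^2 - 4\gamma^{-2}}},
\qquad \Delta R_{12} > 2\gamma^{-1}.
```

The Jacobian alone produces the $1/\sqrt{\Delta R_{12}^2 - 4\gamma^{-2}}$ factor, which is why it appears regardless of the underlying dynamics.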

In Fig. 4.7, we plot the exact distribution $dN_0/dz$, found numerically, for several
values of $\gamma$. The true distribution is qualitatively similar to the approximate one
in Eq. (4.6), which is flat. The peak in the distribution at small $z$ values comes
from the reduced phase space as $z \to 0$, and the peak is lower for larger boosts.
In Fig. 4.8, we plot the exact distribution $dN_0/d\Delta R_{12}$, which is again qualitatively
similar to the large-boost result. The distribution in $\Delta R_{12}$ is localized at the lower
limit, especially for larger boosts. This provides a useful rule: the opening angle of a
decay is strongly correlated with the transverse boost of the parent particle. Note that
the relevant boost is the transverse one because the angular measure $\Delta R$ is invariant
under longitudinal boosts (recall that in the example here, we have set the parent
particle to be transverse).

Figure 4.7: The distribution of all decays in $z$ for several values of $\gamma$.

The constraint imposed by reconstruction is simple in the large-boost approximation.
In terms of $\sin\theta_0$, the constraint $\Delta R_{12} < D$ requires $\sin\theta_0 > 2/(\gamma D)$, which
excludes the region where the approximation breaks down. Therefore the large-boost
approximation is apt for describing the kinematics of a reconstructed decay.

Figure 4.8: The distribution of all decays in $\Delta R_{12}$ for several values of $\gamma$.



In Fig. 4.9, we plot the distribution $dN/d\cos\theta_0$, where the implied sharp cutoff is
apparent (and should be compared to what we observed in Fig. 2.5a). This distribution
is easy to understand in the rest frame of the decay. When $|\cos\theta_0|$ is close to 1,
one of the daughters is nearly collinear with the direction of the boost to the lab frame,
and the other is nearly anti-collinear. The anti-collinear daughter is not sufficiently
boosted to have $\Delta R_{12} < D$ with the collinear daughter, and the parent particle is
not reconstructed. As $|\cos\theta_0|$ decreases, the two daughters can be recombined in the
same jet; this transition is rapid because the $\phi_0$ dependence of the kinematics is small.

Figure 4.9: The reconstructed distribution $dN/d\cos\theta_0$ with $D = 1.0$ for various values
of $\gamma$.

We now look at the distributions of $z$ and $\Delta R_{12}$ when we require reconstruction.
Because $z$ is linearly related to $\cos\theta_0$ for large boosts, the distribution in $z$ has a
simple form:

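$$\frac{dN}{dz} \approx 2\,\Theta\!\left(\tfrac{1}{2} - z\right)\Theta\!\left(z - \frac{1 - \sqrt{1 - 4/(\gamma^2 D^2)}}{2}\right). \tag{4.8}$$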



Comparing to Eq. (4.6), we see that requiring reconstruction simply cuts out the
region of phase space at small $z$. This is confirmed in the exact distribution $dN/dz$,
shown in Fig. 4.10. The small-$z$ decays that are not reconstructed come from the
regions of phase space with $|\cos\theta_0|$ near 1, just as in the previous discussion. In these
decays, the backwards-going (anti-collinear) daughter is boosted to have small $p_T$ in
the lab frame. Comparing to Fig. 4.4, the distribution in $z$ for QCD splittings, we see
first that the cutoffs on the distributions are similar (they are not identical because
of the LL approximation used in Fig. 4.4). However, the QCD distribution has an
enhancement at small $z$ values, due to the QCD soft singularity, that the distribution
for reconstructed decays does not exhibit.

Figure 4.10: The distribution of reconstructed decays in $z$ for several values of $\gamma$.

The distribution of reconstructed particles in the variable $\Delta R_{12}$ is related simply
to the distribution of all decays in the same variable:

$$\frac{dN}{d\Delta R_{12}} = \frac{dN_0}{d\Delta R_{12}}\,\Theta(D - \Delta R_{12}), \tag{4.9}$$

which means that the distribution $dN/d\Delta R_{12}$ is given by Fig. 4.8 with a cutoff at
$\Delta R_{12} = D$. Note that this distribution is very close in shape to the distribution of
QCD branchings versus $\Delta R_{12}$ displayed in Eq. (4.4) and Fig. 4.5. This similarity
arises from the fact that the most important factor in the shape is the square
root singularity, which arises from the change of variables in both cases and hides the
underlying differences in dynamics.

Two-step Decays 

We now turn our attention to two-step decays, which exhibit a more complex
substructure. Two-step decays offer new insights into the ordering effects of the $k_T$ and
CA algorithms, highlight the shaping effects from the algorithm on the jet
substructure, and offer a surrogate for the cascade decays that are often featured in new
physics scenarios. Even at the parton level the choice of jet algorithm matters in
reconstructing a multi-step decay; different algorithms can give different substructure.
In studying this substructure we take the same approach as for the $1 \to 2$ decay,
translating the simple kinematics of a parton-level decay into the lab frame variables
$\Delta R_{12}$ and $z$.

The top quark is a good example of a two-step decay, and we focus on it in this
section. We will label the top quark decay $t \to Wb$, with $W \to qq'$. In this discussion,
requiring that the top quark be reconstructed means that the $W$ must be recombined
from $q$ and $q'$ first, then merged with the $b$. The observed (3-parton) "jet" will then
have the $W$ as one of its daughter subjets.

For the $k_T$ algorithm, reconstructing the top quark in a single jet imposes the
following constraints on the partons:

$$\begin{aligned}
\min(p_{Tq}, p_{Tq'})\,\Delta R_{qq'} &< \min(p_{Tq}, p_{Tb})\,\Delta R_{bq}, \\
\min(p_{Tq}, p_{Tq'})\,\Delta R_{qq'} &< \min(p_{Tq'}, p_{Tb})\,\Delta R_{bq'}, \\
\Delta R_{qq'} &< D, \text{ and} \\
\Delta R_{bW} &< D.
\end{aligned}$$

For the CA algorithm the relations are strictly in terms of the angle:

$$\begin{aligned}
\Delta R_{qq'} &< \Delta R_{bq}, \\
\Delta R_{qq'} &< \Delta R_{bq'}, \\
\Delta R_{qq'} &< D, \text{ and} \\
\Delta R_{bW} &< D.
\end{aligned}$$

The kinematic limits requiring the decay to be reconstructed in a single jet are the
same for the two algorithms, but fixing the ordering of the two recombinations requires
a different restriction for each algorithm, which in turn biases the distributions of
kinematic variables.
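
The two sets of constraints can be checked directly for a given three-parton configuration. The sketch below is illustrative only: the helper names are hypothetical, the partons are taken to be massless, and the $W$ is formed by adding the four-vectors of $q$ and $q'$ (an assumed recombination scheme). It implements only the inequalities listed above.

```python
import math


def four_vector(pt, y, phi):
    """Massless parton four-vector (E, px, py, pz) from pT, rapidity and azimuth."""
    return (pt * math.cosh(y), pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y))


def pt_y_phi(p):
    """Return (pT, rapidity, azimuth) of a four-vector."""
    e, px, py, pz = p
    return (math.hypot(px, py), 0.5 * math.log((e + pz) / (e - pz)), math.atan2(py, px))


def delta_r(a, b):
    """Angular separation in the (y, phi) plane."""
    _, ya, phia = pt_y_phi(a)
    _, yb, phib = pt_y_phi(b)
    dphi = abs(phia - phib)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(ya - yb, dphi)


def top_reconstructed(q, qp, b, D, algorithm):
    """Apply the single-jet and ordering constraints for t -> W b, W -> q q'."""
    w = tuple(x + y for x, y in zip(q, qp))  # assumed four-vector recombination
    in_jet = delta_r(q, qp) < D and delta_r(b, w) < D
    if algorithm == "CA":
        def metric(a, c):
            return delta_r(a, c)
    elif algorithm == "kT":
        def metric(a, c):
            return min(pt_y_phi(a)[0], pt_y_phi(c)[0]) * delta_r(a, c)
    else:
        raise ValueError("algorithm must be 'CA' or 'kT'")
    # The q-q' pair must be recombined before either quark merges with the b.
    ordered = metric(q, qp) < metric(b, q) and metric(q, qp) < metric(b, qp)
    return in_jet and ordered


# Two hand-built configurations (pT in GeV, all partons at y = 0).
# Soft b at larger angle: the angular ordering is correct, the kT ordering is not.
q1, qp1, b1 = four_vector(100, 0, 0.0), four_vector(100, 0, 0.2), four_vector(10, 0, 0.6)
# Soft q' at wide angle, hard b nearby: the kT ordering is correct, the angular one is not.
q2, qp2, b2 = four_vector(100, 0, 0.0), four_vector(10, 0, 0.5), four_vector(100, 0, -0.3)

if __name__ == "__main__":
    for label, (q, qp, b) in (("soft b", (q1, qp1, b1)), ("soft q'", (q2, qp2, b2))):
        print(label, {alg: top_reconstructed(q, qp, b, 1.0, alg) for alg in ("CA", "kT")})
```

The two test configurations show how the same single-jet limits combine with different ordering requirements to select different decays for each algorithm.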

The common requirements such that the top quark be reconstructed in a single
jet, $\Delta R_{qq'} < D$ and $\Delta R_{Wb} < D$, are straightforward to understand in terms of the
rest frame variable $\cos\theta_0$, which here is the polar angle in the top quark rest frame
between the $W$ and the boost direction to the lab frame. For $\cos\theta_0 \approx 1$, the $W$ has a
large transverse boost in the lab frame, so $\Delta R_{qq'} < D$, but the angle between the $W$
and $b$ will be large (as was the case for the corresponding $1 \to 2$ decay in the previous
section). For $\cos\theta_0 \approx -1$, the $W$ transverse boost is small, and $\Delta R_{qq'}$ will be large.
Therefore, we only expect to reconstruct top quarks in a single jet when $|\cos\theta_0|$ is
not near 1.

If the CA algorithm correctly reconstructs the top quark, the two quarks from
the $W$ decay must be the closest pair (in $\Delta R$) of the three final state particles.
This requirement strongly selects for decays where the $W$ opening angle, $\Delta R_{qq'}$, is
smaller than the top quark opening angle, $\Delta R_{Wb}$. Therefore, only decays with a
large (transverse) $W$ boost will be reconstructed by the CA algorithm. In terms of
$\cos\theta_0$, the fraction of decays that are reconstructed will increase as we increase $\cos\theta_0$
towards the upper limit where $\Delta R_{Wb} > D$, and the reconstruction fraction will be
small for lower values of $\cos\theta_0$.

The $k_T$ algorithm orders recombinations by $p_T$ as well as angle, and the set of
reconstructed decays is understood most easily by contrasting with CA. As the
transverse boost of the $W$ decreases, on average the $p_T$ of the $q$ and $q'$ decrease while the $p_T$
of the $b$ increases. Therefore, while $\Delta R_{qq'}$ is increasing, $\min(p_{Tq}, p_{Tq'})$ is decreasing,
and these competing effects suggest that $k_T$ reconstructs decays with smaller values
of $\cos\theta_0$ than CA, and that the dependence on $\cos\theta_0$ is not as strong.

The effect of the CA and $k_T$ algorithms on the observed distribution in $\cos\theta_0$
is shown in Fig. 4.11, where we plot the distribution of $\cos\theta_0$ for reconstructed top
quarks for both algorithms. The top boost is fixed to $\gamma = 3$. We observe that the kinematic
limit near $\cos\theta_0 \approx 0.8$ is common between algorithms, and that $\cos\theta_0 \approx -1$ is not
accessed by either algorithm. As expected, the distribution for the CA algorithm falls
off more sharply than for $k_T$ at lower values of $\cos\theta_0$.

Figure 4.11: $dN/d\cos\theta_0$ vs. $\cos\theta_0$, with $\gamma = 3$, for both the $k_T$ and CA algorithms.
The underlying distribution $dN_0/d\cos\theta_0 = 1/2$ is plotted as the dotted line for
reference.

Next, we look at distributions in $z$ and $\Delta R_{Wb}$. Just as in the $1 \to 2$ decay, we
expect decays with small $z$ not to be correctly reconstructed. Small values of $z$ will
come when the $W$ or $b$ is soft, and therefore produced very backwards-going in the
top rest frame. This corresponds to $\cos\theta_0 \approx \pm 1$, and from Fig. 4.11 these decays are
not reconstructed. In Fig. 4.12, we plot the distribution in $z$ for all decays, $dN_0/dz$,
and the distribution for reconstructed decays, $dN/dz$, for a boost of $\gamma = 3$.

Figure 4.12: $dN_0/dz$ (all decays) and $dN/dz$ (reconstructed decays), with $\gamma = 3$.

In $dN_0/dz$, the discontinuity at $z \approx 0.2$ arises from the fact that the $W$ is sometimes
softer than the $b$, but has a minimum $p_T$. The extra weight in $dN_0/dz$ for $z$
above this value comes from the decays where the $W$ is softer than the $b$. Note that
these decays are rarely reconstructed, especially for CA: the distribution $dN/dz$ is
smooth, and has little additional support in the region where the $W$ is softer. This
correlates with the fact that decays with negative $\cos\theta_0$ values are rarely reconstructed
with CA, but more frequently with $k_T$. The distribution $dN/dz$ has a lower cutoff
that corresponds to the upper cutoff in Fig. 4.11. As the boost $\gamma$ of the top increases,
the cutoff at small $z$ decreases, since the limit in $\cos\theta_0$ for which $\Delta R_{Wb} > D$ will
increase towards 1.

The opening angle $\Delta R_{Wb}$ of the top quark decay also illustrates how strongly
the kinematics are shaped by the jet algorithm. When $\cos\theta_0 \approx -1$, for sufficient
boosts $\Delta R_{Wb}$ is small because the $W$ is boosted forward in the lab frame, but these
decays are not reconstructed because the ordering of recombinations will typically be
incorrect and the $W$ decay may not have $\Delta R_{qq'} < D$. For $\cos\theta_0 \approx 1$, $\Delta R_{Wb}$ will
exceed $D$ and the top will not be reconstructed. In Fig. 4.13, we plot the distribution
$dN_0/d\Delta R_{Wb}$ of the angle between the $W$ and $b$ in all top decays for a top boost of
$\gamma = 3$, as well as the distribution $dN/d\Delta R_{12}$ of the angle of the last recombination
for reconstructed top quarks with the $k_T$ and CA algorithms. Note that when the
top quark is reconstructed at the parton level, $\Delta R_{12} = \Delta R_{Wb}$. The difference in
$dN/d\Delta R_{12}$ between the $k_T$ and CA algorithms reflects their different recombination
orderings. Because CA orders strictly by angle, the angle $\Delta R_{12}$ tends to be larger
than for $k_T$ because CA requires $\Delta R_{12} = \Delta R_{Wb} > \Delta R_{qq'}$.

Figure 4.13: $dN_0/d\Delta R_{Wb}$ (all decays) and $dN/d\Delta R_{12}$ (reconstructed decays), with
$\gamma = 3$.

Contrast with QCD 

Contrasting the figures in the previous two subsections, we can see that at this level of
approximation, QCD splittings and heavy particle decays have distinct kinematics. In
both cases the kinematical requirements of fixed mass and $p_T$ lead to cutoffs in phase
space (recall Fig. 2.6), but within these boundaries the differing dynamics shape the
distributions. For example, QCD splittings tend to have small $z$ (Fig. 4.4), driven by
the soft singularity of the QCD splitting function. A one-step decay (Fig. 4.10) has
a completely flat distribution in $z$, whereas a two-step decay (Fig. 4.12) has a more
complicated shape once we require accurate reconstruction. In the case of $\Delta R_{12}$, the
differences are less dramatic. In both cases kinematics drive $\Delta R_{12}$ to be as small as
possible (compare Figs. 4.5 and 4.8). The two-step decay is more complicated, but
the distributions (Fig. 4.13) still have a peak at low values.

If we wish to separate jets representing heavy particle decays from their QCD background,
after cutting on a jet mass we would presumably be interested in jet substructure.
We have seen that at the parton level, after fixing jet masses, the distributions in $z$
are still distinct enough to expect some additional discrimination. $\Delta R_{12}$, on the other
hand, does not appear useful. To see if these kinematic differences can be exploited,
we must first study how they appear in real jets, where we include the effects of
showering and subsequent reconstruction.

4.2 Algorithm systematics in $e^+e^-$ events

To obtain a more realistic understanding of jet substructure we must turn to simulated
events.^ Monte Carlo event generators replace our simple models with exact matrix
element calculations, supplemented with parton shower algorithms that model the
behavior of QCD showering. Such generators produce events consisting of hundreds
of outgoing hadrons, which we analyze via a jet algorithm. If, after finding jets, we
consider their final merging step (from the shower perspective, their first splitting),
we might expect this branching to resemble the splittings of the previous section's
models. This would be the case if the jet algorithm could precisely undo the parton
shower, but of course this can never be true. In this section we will consider how the
mirror processes of the QCD shower and the jet algorithm shape jet substructure.

^Some of the discussion in this section and the next one is taken from Sections III-V of [2]. All
of the figures, except Fig. 4.23, are new.

We begin by considering top-antitop and dijet events in $e^+e^-$ collisions, since this
excludes a variety of other effects we wish to postpone having to think about. For
both samples we consider events with a center-of-mass energy of 1200 GeV. We will
keep (incongruously) hadron collider language and analysis, since we're only using
$e^+e^-$ collisions as a proxy for "clean" events with no strongly interacting particles in
the initial state. The details of the event generation are given in Appendix C.

4.2.1 QCD jets 

We first consider simulated QCD jets. As suggested earlier, we anticipate two
important changes from the previous discussion. First, the showering ensures that the
daughter subjets at the last recombination have nonzero masses. More importantly,
and as noted in Section 2.4.4, the sequence of recombinations generated by the jet
algorithm tends to force the final recombination into a particular region of phase space
that depends on the recombination metric of the algorithm. For the CA algorithm
this means that the final recombination will tend to have a value of $\Delta R_{12}$ near the
limit $D$, while the $k_T$ algorithm will have a large value of $z\,\Delta R_{12}\,p_{T_J}$. This issue will
play an important role in explaining the observed $z$ and $\Delta R_{12}$ distributions.

First, consider the jet mass distributions from the simulated event samples. In
Fig. 4.14, we plot the jet mass distributions for the $k_T$ and CA algorithms for all
jets in the sample. As expected, for both algorithms the QCD jet mass distribution
smoothly falls from a peak only slightly displaced from zero (the remnant of the
perturbative $\sim \ln(m^2)/m^2$ behavior). There is a more rapid cutoff for $m_J > p_{T_J} D/2$,
which corresponds to the expected kinematic cutoff from the LL approximation, but
smeared by the spread in $p_T$, the nonzero subjet masses and the other small corrections
to the LL approximation. The average jet mass, $\langle m_J \rangle \approx 100$ GeV, is in crude
agreement with the perturbative expectation $\langle m_J/p_{T_J} \rangle \approx 0.2$. Note that in these
events the two algorithms give nearly identical distributions.

Figure 4.14: Distribution in $m_J$ for QCD jets in $e^+e^- \to q\bar{q}$ events, with $D = 1.0$.

Other details of the QCD jet substructure are substantially more sensitive to the
specific algorithm than the jet mass distribution. To illustrate this point we will
discuss the distributions of $z$, $\Delta R_{12}$, and the heavier subjet mass for the last
recombination in the jet. We can understand the observed behavior by combining a simple
picture of the geometry of the jet with the constraints induced on the phase space
for a recombination from the jet algorithm. In particular, recall that the ordering
of recombinations defined by the jet algorithm imposes relevant boundaries on the
phase space available to the late recombinations (see Fig. 2.7).

While the details of how the $k_T$ and CA algorithms recombine protojets within a
jet are different, the overall structure of a large-$p_T$ jet is set by the shower dynamics
of QCD, i.e., the dominance of soft/collinear emissions. Typically the jet has one (or
a few) hard core(s), where a hard core is a localized region in $y$-$\phi$ with large energy
deposition. The core is surrounded by regions with substantially smaller energy
depositions arising from the radiation emitted by the energetic particles in the core (i.e.,
the shower), which tend to dominate the area of the jet. In particular, the periphery
of the jet is occupied primarily by the particles from soft radiation, since even
a wide-angle hard parton will radiate soft gluons in its vicinity. This simple picture
leads to very different recombinations with the $k_T$ and CA algorithms, especially the
last recombinations.

The CA algorithm orders recombinations only by angle and ignores the $p_T$ of the
protojets. This implies that the protojets still available for the last recombination
steps are those at large angle with respect to the core of the jet. Because the core of
the jet carries large $p_T$, as the recombinations proceed the directions of the protojets in
the core do not change significantly. Until the final steps, the recombinations involving
the soft, peripheral protojets tend to occur only locally in $y$-$\phi$ and do not involve the
large-$p_T$ protojets in the core of the jet. Therefore, the last recombinations defined
by the CA algorithm are expected to involve two very different protojets. Typically
one has large $p_T$, carrying most of the four-momentum of the jet, while the other
has small $p_T$ and is located at the periphery of the jet. The last recombination will
tend to exhibit large $\Delta R_{12}$, small $z$, large $a_1$ (near 1), and small $a_2$, where the last
two points follow from the small $z$ and correspond to the $(z, \Delta R_{12})$ phase space of
Fig. 2.6c.

In contrast, the $k_T$ algorithm orders recombinations according to both $p_T$ and
angle. Thus the $k_T$ algorithm tends to recombine the soft protojets on the periphery of
the jet earlier than with the CA algorithm. At the same time, the reduced dependence
on the angle in the recombination metric implies the angle between protojets for the
final recombinations will be lower for $k_T$ than CA. While there is still a tendency for
the last recombination in the $k_T$ algorithm to involve a soft protojet with the core
protojet, the soft protojet tends to be not as soft as with the CA algorithm (i.e., the $z$
value is larger), while the angular separation is smaller. Since this final soft protojet
in the $k_T$ algorithm has participated in more previous recombinations than in the CA
case, we expect the average $a_2$ value to be further from zero and the $a_1$ value to be
further from 1. Generally the $(z, \Delta R_{12})$ phase space for the final $k_T$ recombination
is expected to be more like that illustrated in Figs. 2.6b and 2.6d (coupled with the
boundary in Fig. 2.7b).
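
The different ordering of the two algorithms can be illustrated on a toy jet. The following is a minimal sketch, not the clustering code used for the figures: it assumes that all constituents already belong to one jet (so beam distances and $D$ are ignored), uses the pairwise metric $d_{ij} = \min(p_{Ti}, p_{Tj})^{2p}\,\Delta R_{ij}^2$ with $p = 1$ for $k_T$ and $p = 0$ for CA, recombines by four-vector addition, and takes $z = \min(p_{T1}, p_{T2})/p_{T_J}$ for the last merging. The function names and the three-particle configuration are hypothetical.

```python
import math


def make(pt, y, phi):
    """Massless constituent four-vector (E, px, py, pz)."""
    return (pt * math.cosh(y), pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y))


def pt_y_phi(p):
    """Return (pT, rapidity, azimuth) of a four-vector."""
    e, px, py, pz = p
    return (math.hypot(px, py), 0.5 * math.log((e + pz) / (e - pz)), math.atan2(py, px))


def delta_r(a, b):
    """Angular separation in the (y, phi) plane."""
    _, ya, phia = pt_y_phi(a)
    _, yb, phib = pt_y_phi(b)
    dphi = abs(phia - phib)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(ya - yb, dphi)


def last_recombination(constituents, p):
    """Cluster everything into one jet; return (z, Delta R) of the final merging.

    p = 1 gives the kT ordering, p = 0 the CA ordering (assumed metric above).
    """
    protojets = list(constituents)
    while len(protojets) > 2:
        best = None
        for i in range(len(protojets)):
            for j in range(i + 1, len(protojets)):
                pti = pt_y_phi(protojets[i])[0]
                ptj = pt_y_phi(protojets[j])[0]
                d = min(pti, ptj) ** (2 * p) * delta_r(protojets[i], protojets[j]) ** 2
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = tuple(a + b for a, b in zip(protojets[i], protojets[j]))
        protojets = [x for k, x in enumerate(protojets) if k not in (i, j)] + [merged]
    a, b = protojets
    jet = tuple(x + y for x, y in zip(a, b))
    z = min(pt_y_phi(a)[0], pt_y_phi(b)[0]) / pt_y_phi(jet)[0]
    return z, delta_r(a, b)


# Toy jet: a two-particle hard core plus one soft particle at the periphery.
toy_jet = [make(300.0, 0.0, 0.0), make(200.0, 0.0, 0.3), make(5.0, 0.0, 0.9)]
z_ca, dr_ca = last_recombination(toy_jet, p=0)
z_kt, dr_kt = last_recombination(toy_jet, p=1)

if __name__ == "__main__":
    print("CA:", z_ca, dr_ca)
    print("kT:", z_kt, dr_kt)
```

In this toy configuration CA merges the core first and leaves the soft, wide-angle particle for the last step, while $k_T$ absorbs the soft particle early and ends with a merging between the two core particles.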

To illustrate this discussion we have plotted distributions of $z$, $\Delta R_{12}$, and $a_1$ for
the last recombination in a jet for the $k_T$ and CA algorithms in Fig. 4.15. We plot
distributions with and without a cut on the jet mass, where the cut is a narrow
window ($\approx 15$ GeV) around the top quark mass. This cut selects heavy QCD jets:
for the jets in this sample, with $p_T$ between 500-600 GeV, it corresponds to a cut
on $x_J$ of 0.06-0.09. These distributions reflect the combined influence of the QCD
shower dynamics, the restricted kinematics from being in a jet, and the
algorithm-dependent ordering effects discussed above. Most importantly, note the very strong
enhancement at the smallest values of $z$ for the CA algorithm in Fig. 4.15a, which
persists even after the heavy jet mass cut. Note the log scale in Fig. 4.15a! While
the $k_T$ result in Fig. 4.15b is still peaked near zero when summed over all jet masses,
the enhancement is not nearly as strong. After the heavy jet mass cut is applied,
the distribution shifts to larger values of $z$, with an enhancement remaining at small
values. Only in this last plot is there evidence of the lower limit on $z$ of order 0.1
expected from the earlier LL approximation results.

Fig. 4.15c illustrates the expected enhancement near $\Delta R_{12} = D = 1.0$ for CA.
Fig. 4.15d shows that $k_T$ exhibits a much broader distribution than CA with an
enhancement for small $\Delta R_{12}$ values. Once the heavy jet mass cut is applied, both
algorithms exhibit the lower kinematic cutoff on $\Delta R_{12}$ suggested in the LL approximation
results, as both distributions shift to larger values of the angle. This shift serves
to enhance the CA peak at the upper limit and moves the lower end enhancement in
$k_T$ to substantially larger values of $\Delta R_{12}$.

The CA algorithm bias toward large $a_1$ is demonstrated in Fig. 4.15e. We can
see that requiring a heavy jet enhances the large-$a_1$ peak. The $k_T$ distribution in
$a_1$, shown in Fig. 4.15f, exhibits a broad enhancement around $a_1 \approx 0.4$. This
distribution is relatively unchanged after the jet mass cut. To give some insight into
the correlations between $z$ and $\Delta R_{12}$, in Fig. 4.16 we plot the distribution of both
variables simultaneously for both algorithms, with no jet mass cut applied. The very
strong enhancement at small $z$ and large $\Delta R_{12}$ for CA is evident in this plot. For $k_T$,
there is still an enhancement at small $z$, but there is support over the whole range
in $z$ and $\Delta R_{12}$ with the impact of the shaping due to the $z \times \Delta R_{12}$ dependence in
the metric clearly evident. Note that the $k_T$ distribution is closer to what one would
expect from QCD alone, with enhancements at both small $z$ and small $\Delta R_{12}$, while
the CA distribution is asymmetrically shaped away from the QCD-like result. Finally
we should recall, as indicated by Fig. 4.14, that the jets found by the two algorithms
tend to be slightly different, with the $k_T$ algorithm recombining slightly more of the
original (typically soft) protojets at the periphery and leading to slightly larger jet
masses.

Figure 4.15: Distribution in $z$, $\Delta R_{12}$, and the scaled (heavier) daughter mass $a_1$ for
QCD jets in $e^+e^- \to q\bar{q}$ events, using the CA and $k_T$ algorithms, with (red) and
without (blue) a cut around the top quark mass. $D = 1.0$.

Figure 4.16: Combined distribution in $z$ and $\Delta R_{12}$ for QCD jets in $e^+e^- \to q\bar{q}$ events,
using the CA (left) and $k_T$ (right) algorithms, for jets with $p_T > 500$ GeV and $D = 1.0$.

Because the QCD shower is present in all jets, and is responsible for the complexity
in the jet substructure, the systematic effects discussed above will be present in all
jets. While the kinematics of a heavy particle decay is distinct from QCD in certain
respects, we will find in the next subsection that these effects still present themselves
in jets containing the decay of a heavy particle.

4.2.2 Jets from heavy particle decays

For an example of a heavy particle decay, we now consider the systematic effects of
showering and the jet algorithm on top quark jets. We consider $e^+e^- \to t\bar{t}$ events
with $Q = 1200$ GeV, so each top quark will have $E = 600$ GeV, and will tend to
appear within a single jet (we use $D = 1.0$). Details of the event generation are
given in Appendix C. Note that even in the relatively clean $e^+e^- \to t\bar{t}$ events, the
top quarks are not themselves color singlets, so hadronization connects the jets;
fortunately this is a small effect since the top quarks' energy and mass are both much
larger than the hadronization scale. After reconstructing "top jets", we expect that
the kinematics of the last few mergings/splittings will differ in important ways from
our parton-level predictions. For instance, with the CA algorithm we expect that soft
recombinations will occur at the last recombination step, even for jets that contain
the decay products of a top quark. This can make the substructure look more like a
heavy QCD jet than a top quark decay, and subsequently the jet may not be properly
identified.

To demonstrate this point, in Fig. 4.17 we plot the distribution in $z$ for jets with
mass within a window around the top quark mass. The distribution for CA jets is very
different from the parton-level distribution (Fig. 4.12). The excess at small values of
$z$ arises from soft recombinations in the CA algorithm, which make the distribution
similar to that for QCD jets (Figs. 4.15a and 4.15b). For the $k_T$ algorithm, there are
rarely soft recombinations late in the algorithm, because the metric orders according
to $z$ as well as $\Delta R$.

In these relatively clean events, the $k_T$ and CA algorithms find very nearly the
same jets. This can be seen in Fig. 4.18, where we plot the jet mass distribution for
both algorithms. Thus the effects we have seen stem from different ordering in the
algorithms, not differences in the particles that get included in the jet. We will see
in the next section that differences in what is included in the jet play a bigger role at
hadron colliders.

Figure 4.17: Distribution in $z$ for jets with the top mass in $e^+e^- \to t\bar{t}$ events. $D = 1.0$.

4.3 Event effects on jet substructure in hadron collisions

At a hadron collider like the LHC, there are additional systematic effects on jet 
substructure. We need to account for the combined effect of splash-in from several 
sources: initial state radiation (ISR, the radiation from the incoming partons in the 
hard scattering), the underlying event (UE, the rest of the pp interaction), and pile-up 
(other pp collisions that occur in the same time bin). All of these sources add particles 
to jets that are typically soft and approximately uncorrelated. Splash-in particles will 
mostly be located at large angle to the jet core, simply because there is more area
there. How these particles affect jet substructure depends on the algorithm used. We
expect them to contribute similarly to soft radiation from the QCD shower, discussed
in the previous section. In this section we will consider the effects of adding ISR and
UE. We expect the effects of pile-up will be of a similar nature, although possibly of
a much greater magnitude depending on the collider luminosity.

Figure 4.18: Distribution in jet mass for jets in the neighborhood of the top mass in
e+e− → tt̄ events for the CA (black) and kT (red) algorithms. D = 1.0.

We should note that such a clean separation of different effects is artificial.*
Whether outgoing gluons were radiated from initial- or final-state partons is not
quantum-mechanically meaningful, so the amplitudes for initial- and final-state
radiation must interfere. The same is true for the underlying event. In addition to
interference at the perturbative level, hadronization in general will, and often must,
link these different processes together. The particles seen by the detector are of course
color singlets, so quarks and gluons in the "final state" must connect with each other
or the rest of the event to form hadrons. This makes the question of whether a
hadron belongs to final-state radiation, initial-state radiation, or underlying event
inherently ambiguous. In this section as we progressively include more of the event
in our Monte Carlo samples, we should think of this as building a progressively more
realistic model of QCD into the Monte Carlo, not as simply adding another source of
final-state hadrons.

* Except in the case of pile-up, where the separation is perfectly well defined.

4.3.1 Mass effects 

As we consider jets in hadron collisions, the natural place to begin is jet masses. In
Fig. 4.19 we plot the mass distribution for jets in tt̄ events with the CA and kT
algorithms. In each plot we show the distribution for three kinds of Monte Carlo samples
from Pythia: events where we only include radiation from final-state partons, events
including initial-state radiation, and events including both initial-state radiation and
underlying event activity. The precise details of these samples are given in Appendix C.

As we include more of the full event's activity, the jet mass distribution is broadened,
with a peak that shifts upward. We can contrast these results with Fig. 4.18,
where we found the equivalent distribution for e+e− → tt̄ events. In that case, the
jet mass distribution had a clean upper bound at the top mass, with a tail for lower
masses. This has a simple interpretation: in e+e− → tt̄ events, essentially all final-state
hadrons come from one top decay or the other, and for high-Q² events these are
well separated. We reconstruct jets that can encompass the entire top quark decay,
but there is nothing else to pick up so the mass distribution cuts off at mt. Some
amount of radiation will in general be emitted outside the jet radius, leading to the
tail at lower masses.

This lower tail shows up again in pp events, but now a high-mass tail is present as
well. For the FSR sample, the tail is slightly larger than for e+e− events: even without
full ISR and UE simulation, Pythia must arrange color connections to produce
outgoing hadrons and deal with the beam remnants, so even these events are not as
clean as in e+e− collisions. Note that whereas the process e+e− → tt̄ always occurs
through an electroweak boson, at a pp collider the dominant process is gluon fusion
(gg → g* → tt̄), so the final state is not a color singlet.

Figure 4.19: Distribution in mJ for tt̄ jets in pp → tt̄ events, using the CA and
kT algorithms, with only FSR (blue), including ISR (red), and including ISR and UE
(green). Jets have pT > 500 GeV and D = 1.0.

As we include the full effects of ISR and UE, there is more radiation in the event 
that can be included in the top quark jets. This naturally leads to broader mass 
distributions and a higher mass peak. We can see that adding UE has a much bigger 
effect than adding ISR. In understanding Monte Carlo simulations as well as data, 
UE will be the more important consideration. 

In e+e− collisions, the CA and kT algorithms found essentially the same jets
despite their different substructure ordering. We can see by comparing the upper
and lower figures in Fig. 4.19 that this is not the case in pp events. kT jets, while
similar to CA jets in the FSR sample, are substantially more susceptible to the mass
broadening induced by the addition of ISR and UE. This effect is a manifestation of
the kT algorithm's larger and more irregular "jet area" [90].

In Fig. 4.20, we show the analogous plots for jet masses in QCD multijet events.
Broadly, the same effects are visible as in tt̄ events. Adding ISR and UE shifts the jet
mass distribution upward, significantly increasing the number of jets falling inside the
top quark's mass window. The kT algorithm is again more susceptible to the extra
radiation, although the effect is less pronounced.

If our goal is to search for top quarks by looking for jets with a mass near mt,
we can see that the hadronic environment has two pernicious effects. First, the mass
distribution for top quark jets (the signal distribution) is broadened and has a
lower peak. Second, the multijet mass distribution (the background distribution)
is shifted upwards so that the number of background events is larger in the region we're
interested in. For the reasons discussed at the beginning of this section, completely 
removing the effects of ISR and UE is not possible even in principle. But to the extent 
we can remove them we will improve our ability to identify heavy particles in jets. 
Figure 4.20: Distribution in mJ for QCD jets in matched pp → jets events, using the
CA and kT algorithms, with only FSR (blue), including ISR (red), and including ISR
and UE (green). Jets have pT > 500 GeV and D = 1.0.

4.3.2 Substructure effects

To further explore the effects of the hadronic environment on heavy particle jets, we 
now turn to jet substructure. An understanding of how event effects appear in jet 
substructure will help us see how to mitigate them. 

In Fig. 4.21 we plot the substructure variables z, ΔR12, and a1 for pp → tt̄ events,
using the same three samples as in the previous subsection. For CA jets the changes are
clearest in the ΔR12 distribution (Fig. 4.21c). ISR and UE push upward the final
angle of recombination ΔR12. The CA algorithm recombines protojets in order of
angular separation, so the final mergings already tend to be at large angles. To push
the final angle even larger, ISR and UE must be adding radiation at the periphery
of the jet which can be merged in late in the jet algorithm. In Fig. 4.21d we can see
that this effect does not occur in kT jets. kT orders by pT as well as angle, so soft
radiation at the periphery is merged into the jet early on, leaving as the final merging
the combination of moderately-separated hard protojets, perhaps representing the
W and b in the case of a top quark jet. We can conclude that the effects seen in the
CA distributions are due to soft, large-angle radiation, and not to a more fundamental
shift in the hard subjet dynamics, because this would show up in the kT distribution.
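
The difference in ordering can be illustrated with a small numerical sketch. It assumes the standard pairwise metrics (ΔR² for CA, min(pTi, pTj)² ΔR² for kT, both scaled by 1/D²), ignores the beam distance, and uses three hypothetical protojets given as (pT, y, φ); the function names are illustrative only.

```python
import math

def delta_r(a, b):
    """Angular distance between two protojets given as (pT, y, phi)."""
    dphi = abs(a[2] - b[2])
    if dphi > math.pi:
        dphi = 2 * math.pi - dphi
    return math.hypot(a[1] - b[1], dphi)

def pair_metric(a, b, algorithm, D=1.0):
    """Pairwise recombination metric; smaller values are merged first."""
    dr2 = (delta_r(a, b) / D) ** 2
    if algorithm == "CA":
        return dr2                           # angle only
    if algorithm == "kT":
        return min(a[0], b[0]) ** 2 * dr2    # softer pT times angle
    raise ValueError(f"unknown algorithm: {algorithm}")

# Hypothetical protojets: a hard pair at moderate angle, and one soft
# particle at the periphery of the jet.
hard_a = (300.0, 0.0, 0.0)
hard_b = (200.0, 0.3, 0.0)
soft = (2.0, 0.0, 0.8)

for alg in ("CA", "kT"):
    hard_pair = pair_metric(hard_a, hard_b, alg)
    soft_pair = pair_metric(hard_a, soft, alg)
    first = "hard pair" if hard_pair < soft_pair else "soft particle"
    print(f"{alg}: merges the {first} first")
```

With these inputs CA merges the hard pair first and leaves the soft particle for the last recombination, while kT absorbs the soft particle early, in line with the behavior described above.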

Moreover, if we consider the distributions in a1, the scaled heavier subjet mass,
CA jets are pushed more toward a1 = 1. This corresponds to one subjet having the
same mass as the jet, with the other subjet having a mass close to zero: the heavier subjet
should presumably be associated with the top quark whereas the light subjet is likely
to be soft radiation. This radiation is quite possibly from ISR or UE, but in any case
is not contributing significantly to the jet mass.

For kT there is no obvious systematic effect on z or ΔR12, but we see that the
distribution in a1, peaked at mW/mt, is broadened just like the jet mass distribution.

In Fig. 4.22 we show the same plots for the QCD multijet samples. We can again
see that ISR and UE add additional soft radiation at large angle, pushing up the
distribution in ΔR12. This is true even for kT jets: in QCD events kT jets have a
smaller typical opening angle than in tt̄ events, so the scope for contamination from ISR
and UE is greater. Other than the shift in ΔR12 the substructure variables are not
strongly affected.

Figure 4.21: Distribution in z, ΔR12, and the scaled (heavier) daughter mass a1 for
tt̄ jets in pp → tt̄ events, using the CA and kT algorithms, with only FSR (blue),
including ISR (red), and including ISR and UE (green). Jets have pT > 500 GeV and
D = 1.0.

The substructure distributions for signal and background suggest that a large part
of the effect of ISR and UE consists of the addition of soft, large-angle radiation. The
effects are less pronounced for kT jets, especially in the signal sample, but kT jets have
their own disadvantage. Whereas CA jets tend to have ISR and UE radiation included
toward the end of the algorithm, shifting the kinematics of the final substructure, kT
jets include more extra radiation earlier. This can be seen in the mass distributions
in Figs. 4.19 and 4.20. The distortions in CA substructure are in fact an advantage:
they will give us a tool for removing (some of) the contributions of ISR and UE.

4.4 Summary 

We have seen numerous examples that the kinematics of the jet substructure in the
last recombination for CA is a poor indicator for the physics of the jet. However, we
can characterize the aberrant substructure very simply. For the CA algorithm, late
recombinations (necessarily at large ΔR) with small z are more likely to arise from
systematic effects of the algorithm than from the dynamics of the underlying physics
in the jet. For the kT algorithm, the poor mass resolution of the jet arises from earlier
recombinations of soft protojets. The last recombination for kT is representative of
the physics of the jet, but the degraded mass resolution makes it difficult to efficiently
discriminate between jets reconstructing heavy particle decays and QCD. While small-z,
large-ΔR recombinations are not as frequent late in the kT algorithm as in CA,
they do contribute the most to the poor mass resolution of kT.

Figure 4.22: Distribution in z, ΔR12, and the scaled (heavier) daughter mass a1
for QCD jets in matched pp → jets events, using the CA and kT algorithms, with
only FSR (blue), including ISR (red), and including ISR and UE (green). Jets have
pT > 500 GeV and D = 1.0.

As a simple example of the sensitivity of the mass to small-z, large-ΔR recombinations,
consider the recombination i,j → p of two massless objects in the small-angle
approximation. The mass of the parent p is given by $m_p^2 = p_{Tp}^2\, z(1-z)\, \Delta R_{ij}^2$, as in
Eq. (2.41). Suppose the value of the recombination metric, $\rho_{ij}(k_T) = p_{Tp}\, z\, \Delta R_{ij}$, is
bounded below by a value $\rho_0$ (say by previous recombinations), and the recombination
i,j → p occurs at $\rho_{ij}(k_T) = \rho_0$. Then the mass of the parent is $m_p^2 = \rho_0^2 (1-z)/z$,
which is largest for small z. Therefore, at a given stage of the algorithm, small-z
recombinations have a large effect on the mass of the jet.
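
The scaling can be checked numerically. The sketch below assumes the small-angle, massless-daughter formulas quoted above; the value of rho0 and the function names are hypothetical and chosen only for illustration.

```python
import math

def parent_mass_sq_direct(pt_parent, z, delta_r):
    """Small-angle parent mass squared: pT_p^2 z (1 - z) dR^2."""
    return pt_parent ** 2 * z * (1.0 - z) * delta_r ** 2

def parent_mass_sq_fixed_metric(rho0, z):
    """Parent mass squared at a fixed kT metric value rho0 = pT_p z dR."""
    return rho0 ** 2 * (1.0 - z) / z

# Consistency check of the two forms for one hypothetical recombination.
pt_parent, z, dr = 500.0, 0.1, 0.4
rho = pt_parent * z * dr
assert math.isclose(parent_mass_sq_direct(pt_parent, z, dr),
                    parent_mass_sq_fixed_metric(rho, z))

# At fixed rho0 (hypothetical value, in GeV), softer recombinations
# produce heavier parents.
rho0 = 10.0
for z in (0.5, 0.2, 0.05, 0.01):
    m = math.sqrt(parent_mass_sq_fixed_metric(rho0, z))
    print(f"z = {z:4.2f}  ->  m_p = {m:6.1f} GeV")
```

The parent mass grows monotonically as z decreases at fixed metric value, which is the sensitivity described in the text.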

When we can resolve the mass scales of a decay in a jet, the distribution of
kinematic variables matches closely what we expect from the parton-level kinematics
of the decay. For the example of the top quark decay, if we select jets with the top
mass that have a daughter subjet with the W mass, the kinematic distributions of z
and ΔR12 closely match the distributions from the parton-level decay of the top quark.
We show this in Fig. 4.23, where we make a top quark "hadron-parton" comparison
for z and ΔR12. The specifics of the mass cuts are described in Sec. 5.4. In the
parton-level events, we simply require that the top quark decay to three partons be
fully reconstructed by the algorithm in a single jet, namely that the W is correctly
recombined first from its decay products before recombination with the b quark to
make the top. The parton-level events have the same distribution of top quark boosts
as the top jets in the hadron-level events. It is clear that simply requiring the hadron-level
jet to have the top mass, which makes no cut on the substructure, leads to
kinematic distributions in z and ΔR12 for CA that do not match the parton-level
distributions, although the distributions do match quite well for the kT algorithm.
The excess of small-z recombinations for CA in the hadron-level jet with only a jet
mass cut arises from jet algorithm effects discussed previously. After the subjet mass
cut, these are removed and the distribution of z in the jet matches the reconstructed
parton-level decay very well.

Therefore, when we can accurately reconstruct the mass scales of a decay in a jet,
the kinematics of the jet substructure tend to reproduce the parton-level kinematics
of the decay. This suggests that if we can reduce systematic effects that generate
misleading substructure, we can improve heavy particle identification and separation
from background. Reducing these systematic effects can also improve the mass resolution
of the jet, which will aid in identifying a heavy particle decay reconstructed in
a jet and in rejecting the QCD background.

Figure 4.23: Distributions in z and ΔR12 comparing top quark decays at the
parton level and from Monte Carlo events. The jets have pT between 500 and 700
GeV, and have D = 1.0. The parton-level top decays have the same distribution of
boosts as the Monte Carlo top jets. Jets in the upper plots have a mass cut on the
jet; the lower plots include a subjet mass cut. The details of these cuts are described
in Sec. 5.4.
Chapter 5

IMPROVING HEAVY PARTICLE SEARCHES WITH JET 
SUBSTRUCTURE: "JET PRUNING" 

5.1 Pruning: "Cleaning up" jet substructure 

We now define a technique that modifies the jet substructure to reduce the systematic
effects that obscure heavy particle reconstruction.* In general, we will think of a
pruning procedure as using a criterion on kinematic variables to determine whether
or not a branching is likely to represent accurate reconstruction of a heavy particle
decay. This takes the form of a cut: if a branching does not pass a set of cuts on
kinematic variables, that recombination is vetoed. This means that one of the two
branches to be combined (determined by some test on the kinematics) is discarded
and the recombination does not occur.

* This section is taken, with small modifications, from Sec. VI of [2].

In Sec. 4.4, we identified recombinations that are unlikely to represent the reconstruction
of a heavy particle. These can be characterized in terms of the variables z
and ΔR: recombinations with large ΔR and small z are much more likely to arise
from systematic effects of the jet algorithm and in QCD jets rather than heavy particle
reconstruction (compare the upper and lower figures in Fig. 4.23). We expect that
removing (pruning) these recombinations will tend to improve our ability to measure
jet substructure, including subjet masses. We also expect that this procedure will
systematically shift the QCD mass distribution lower, reducing the background in
the signal mass window. Finally this procedure is expected to reduce the impact
of uncorrelated soft radiation from the underlying event and pile-up. We therefore
define the following pruning procedure:
0. Start with a jet found by any jet algorithm, and collect the objects (such as
calorimeter towers) in the jet into a list L. Define parameters Dcut and zcut for
the pruning procedure.

1. Rerun a jet algorithm on the list L, checking for the following condition in each
recombination i,j → p:

$$ z = \frac{\min(p_{Ti}, p_{Tj})}{p_{Tp}} < z_{\rm cut} \quad \text{and} \quad \Delta R_{ij} > D_{\rm cut} \qquad (5.1) $$

This algorithm must be a recombination algorithm such as the CA or kT algorithms,
and should give a "useful" jet substructure (one where we can meaningfully
interpret recombinations in terms of the physics of the jet).

2. If the conditions in 1. are met, do not merge the two branches i and j into p.
Instead, discard the softer branch, i.e., veto on the merging. Proceed with the
algorithm.

3. The resulting jet is the pruned jet, and can be compared with the jet found in 
Step 0. 
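
The steps above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the implementation used for the studies in this chapter: constituents are massless four-vectors (E, px, py, pz), the closest pair is found by brute force, there is no beam-distance step (every constituent of the input jet is either merged or discarded), and Dcut defaults to mJ/pTJ of the input jet. All function names are hypothetical.

```python
import math

def massless(pt, y, phi):
    """Massless four-vector (E, px, py, pz) from (pT, y, phi)."""
    return (pt * math.cosh(y), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(y))

def pt(p):
    return math.hypot(p[1], p[2])

def rap(p):
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))

def mass(p):
    m2 = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
    return math.sqrt(max(m2, 0.0))

def delta_r(a, b):
    dphi = abs(math.atan2(a[2], a[1]) - math.atan2(b[2], b[1]))
    if dphi > math.pi:
        dphi = 2 * math.pi - dphi
    return math.hypot(rap(a) - rap(b), dphi)

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def prune(constituents, z_cut=0.10, algorithm="CA", d_cut=None):
    """Recluster the constituents of one jet, vetoing mergings with
    z < z_cut and delta_r > d_cut by discarding the softer branch."""
    jet = tuple(sum(c) for c in zip(*constituents))   # Step 0: original jet
    if d_cut is None:
        d_cut = mass(jet) / pt(jet)
    protojets = list(constituents)
    while len(protojets) > 1:
        # Step 1: find the pair with the smallest recombination metric.
        best = None
        for i in range(len(protojets)):
            for j in range(i + 1, len(protojets)):
                a, b = protojets[i], protojets[j]
                d = delta_r(a, b) ** 2
                if algorithm == "kT":
                    d *= min(pt(a), pt(b)) ** 2
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        a, b = protojets[i], protojets[j]
        parent = add(a, b)
        z = min(pt(a), pt(b)) / pt(parent)
        del protojets[j]      # j > i, so delete j first
        del protojets[i]
        if z < z_cut and delta_r(a, b) > d_cut:
            # Step 2: veto the merging and keep only the harder branch.
            protojets.append(a if pt(a) > pt(b) else b)
        else:
            protojets.append(parent)
    return protojets[0]       # Step 3: the pruned jet

# Hypothetical jet: two hard prongs plus one soft, wide-angle particle.
parts = [massless(300.0, 0.0, 0.0), massless(200.0, 0.3, 0.0),
         massless(3.0, 0.0, 0.9)]
unpruned = tuple(sum(c) for c in zip(*parts))
pruned = prune(parts, z_cut=0.10, algorithm="CA")
print(f"unpruned mass: {mass(unpruned):.1f} GeV, pruned mass: {mass(pruned):.1f} GeV")
```

In this toy example the soft particle fails both conditions of Eq. (5.1) when it is offered for merging, so it is discarded and the pruned jet mass equals the mass of the two hard prongs alone.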

This technique is intended to be generically applicable in heavy particle searches.
It generalizes analysis techniques suggested by other authors, including "filtering"
[15] and "top-tagging" [18], in that these methods also modify the jet substructure
to help separate a particular signal from backgrounds. In particular, the use of
the variables z and ΔRij follows the use of δp and δr in [18], with the significant
difference that δp measures softness relative to the total jet, and we define z to be a
"local" variable that only depends on the two protojets being recombined. A more
important distinction is that filtering and top-tagging are designed to find a specific
number of subjets to map onto a specific decay, whereas pruning is intended to be
applied to an entire jet with no bias toward a specific substructure configuration.
While we think this generality is novel, we emphasize that pruning is an evolution
from earlier methods and relies on the same physical effects. We have endeavored to
justify our claim for generality with the discussions in Chapter 4, which demonstrate 
that the interpretation of jet substructure is subject to generic systematic effects that 
can be well characterized. Pruning is not the only option, but offers some advantages 
which we explore in further studies below. 

In the analysis of pruning, we will explore the dependence of the pruned jets on
the value of D from the jet algorithm. When reconstructing a boosted heavy particle
in a single jet, without pruning the reconstruction is optimized if the value of D is
fit to the expected opening angle of the decay. However, this angle depends on the
mass of the particle (which is not known in a search) and its pT. We will show that
pruning reduces the sensitivity to D and allows one to use large-D jets over a broad
range in pT to search for heavy particles.

Values for the two parameters of the pruning procedure, zcut and Dcut, can be
well motivated. In the following studies, we will show that the results of pruning 
are rather insensitive to the parameters, and that the optimal parameters are similar 
for different searches. That is, it is not necessary to tune the pruning procedure for 
individual searches. 

The parameter zcut can be chosen based on the analysis of single-step and multi-step
decays in Sec. 4.1.2. Near the limit in boost where decays are reconstructed in
a single jet, the value of z is typically large. It is only at large boosts, where the
production rate of heavy particles is much smaller, that small values of z are allowed
for reconstructed decays (see Fig. 4.10). Therefore, we can choose a value of zcut that
will keep all reconstructed parton-level decays at small boost, and only remove a small
fraction of decays at larger boosts. We expect that zcut ≈ 0.10 will be a reasonable
compromise. Note that Fig. 4.23a indicates that much of the soft radiation distorting
the substructure for CA jets has z < 0.02, so that at least for CA a zcut not much
bigger than this should be effective.
The parameter Dcut can be determined on a jet-by-jet basis, allowing pruning to
be more adaptive than a fixed-parameter procedure. Dcut determines how much of the
jet substructure can be pruned, with smaller values allowing for more pruning. Dcut
should be sufficiently small so that if a decay is "hidden" inside the jet substructure
by late recombinations of, say, UE particles, the substructure can be pruned and
the decay can be found. A value that is too small, however, will result in over-pruning.
A natural scale for Dcut is the opening angle of the jet. However, this is
an infrared unsafe quantity, as soft radiation can change the opening angle. Instead,
the dimensionless ratio mJ/pTJ for the jet is related to the opening angle: typically,
ΔR12 ≈ 2mJ/pTJ. Therefore, we choose Dcut to scale with 2mJ/pTJ. Dcut = mJ/pTJ
is a reasonable starting value.
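
The relation between the opening angle and mJ/pTJ can be made explicit with the small-angle mass formula quoted in Sec. 4.4. As a sketch, assume the jet is the recombination of two massless daughters carrying pT fractions z and 1 − z:

```latex
m_J^2 \simeq z(1-z)\, p_{TJ}^2\, \Delta R_{12}^2
\quad\Longrightarrow\quad
\Delta R_{12} \simeq \frac{m_J}{p_{TJ}\sqrt{z(1-z)}} \;\ge\; \frac{2\, m_J}{p_{TJ}},
```

with equality for a symmetric splitting, z = 1/2. Under these assumptions the choice Dcut = mJ/pTJ corresponds to half the minimum opening angle of such a two-body configuration.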

5.2 Effects of pruning in e+e− collisions

Having defined the pruning procedure, we now wish to study its effects. In this study,
we use the parameters Dcut = mJ/pTJ for both algorithms, and zcut = 0.10 for the
CA algorithm and 0.15 for the kT algorithm. We will motivate these parameters in
Sec. 5.5.1.

We begin with jets in e+e− collisions as a baseline. Although we expect pruning
will be most useful at hadron colliders, it is instructive to consider how it affects
jets in a simpler environment. In Fig. 5.1 we show the distribution in substructure
kinematics for e+e− → tt̄ events.

For the kT algorithm, pruning does not significantly affect the kinematics of the
final branching. Pruning only removes soft, wide-angle mergings, which rarely occur
as the last merging in a kT jet. The small reduction in the a1 peak corresponds to
occasionally identifying the W + b merging correctly but discarding the b for being
too soft. That pruning does occasionally happen can also be seen in the depletion of
jets with z < 0.15, the softness cutoff used in these plots.

For CA on the other hand pruning has a large effect. Nearly 20% of unpruned
jets had z < 0.01; these mergings have nearly all been eliminated. (The requirement
that only mergings with ΔR12 > Dcut can be pruned means that some jets survive
with z < zcut.) Since the final merging(s) of CA jets are often pruned, we see that the
distributions in ΔR12 and a1 are shifted. Jets with z ≈ 0 have a1 ≈ 1, so this peak has
disappeared. The typical final opening angle has also been shifted downward. The
double peaks correspond to the kinematically typical opening angles for top quark
and W boson decays at this pT.

Figure 5.1: Distributions in z, ΔR12, and a1 for pruned and unpruned jets in
e+e− → tt̄ events.

In Fig. 5.2 we show the same plots for the e+e− → qq̄ sample. For both algorithms
pruning has a significant effect. z and ΔR12 are pushed toward zero, indicating that
all but a very narrow hard core of the QCD jets is being pruned away. For each
variable, jets with only one constituent are included in the zero bin, which explains
the excess at z ≈ 0 for kT. The distributions in a1 suggest that asymmetric mergings
are pruned away from jets until all that remains are a few reasonably symmetric
low-mass protojets.

Of course the most important effect of pruning is on the jet mass distribution,
which we plot in Fig. 5.3 for the tt̄ sample and in Fig. 5.4 for the dijet sample. In
the signal sample, the unpruned algorithms already performed quite well at finding
tops, and pruning degrades this somewhat. Note that the W peak increases for both
algorithms, indicating we sometimes prune a top down to a W. In the background
sample, pruning shifts the mass distribution down considerably, but the effect is
negligible in the top mass window: the jets with masses this large are not affected
by pruning. We can conclude that pruning is probably not very useful in a search
for top quarks in this case, although it might be useful in a search for decays with a
smaller m/pT.

5.3 Effects of pruning in pp collisions 

We saw in Sec. 4.3 that hadron collisions are noisier than electron collisions, with
radiation coming from initial partons as well as multiple interactions of beam remnants.
Pruning is intended to remove as much of this "extra" radiation as possible,
so we now repeat the analysis of the previous section for pp event samples to see its
effects. We compare pruned and unpruned jets, acting on the "FSR" (just final-state
radiation) and "FSR+ISR+UE" (full simulation) samples from Sec. 4.3. We might
hope that pruning, acting on jets in the latter sample, would yield results similar to
the former. In fact pruning systematically shifts the kinematics of both samples, but
in such a way that the end results are similar: pruned FSR jets are remarkably similar
to pruned FSR+ISR+UE jets.

Figure 5.2: Distributions in z, ΔR12, and a1 for pruned and unpruned jets in
e+e− → qq̄ events.

Figure 5.3: Distribution in mJ for pruned and unpruned jets in e+e− → tt̄ events,
using the CA and kT algorithms. Jets have pT > 500 GeV and D = 1.0.

Figure 5.4: Distribution in mJ for pruned and unpruned jets in e+e− → qq̄ events,
using the CA and kT algorithms. Jets have pT > 500 GeV and D = 1.0.

In Fig. 5.5 we plot substructure kinematic distributions for pruned and unpruned
jets in the pp → tt̄ samples. As in the e+e− events, pruning removes soft, large-angle
radiation and hence depletes the small-z, large-ΔR12 regions of phase space.
The z ≈ 0 and a1 ≈ 1 peaks for CA disappear, while for kT the substructure is
largely unaffected. Unlike in e+e− events, the a1 peak is strongly enhanced for both
algorithms: pruning improves our ability to resolve a W subjet. For CA, it is notable
that while including ISR and UE shifts the substructure distributions (particularly
ΔR12), this difference is greatly reduced after pruning. Pruning is largely removing
the effect of extra radiation.

In Fig. 5.6 we show the same plots for the matched pp → jets samples. As in the
e+e− events, we can see that jets are being "pruned back" to have small ΔR12 and a1,
with a spike at ΔR12 = 0 representing jets with only one constituent left. As in the
tt̄ sample, the FSR and FSR+ISR+UE distributions are more similar after pruning
than before.

We arrive at last at the key metric for pruning: jet masses in pp events. In Fig. 5.7
we plot jet masses before and after pruning for the tt̄ samples; in Fig. 5.8 we show
the same plots for the multijet background samples. In the signal sample we see that
pruning narrows the peak near the top mass, especially for kT. The peak for pruned
FSR+ISR+UE jets is not as sharp as for FSR jets, but pruning provides a clear
improvement. Recall from Sec. 4.3 that the separation in FSR/ISR/UE is to some
extent artificial, and we should not expect any method on fully simulated events to
reproduce the simplicity of the FSR sample.

Figure 5.5: Distributions in z, ΔR12, and a1 for pruned and unpruned jets in pp → tt̄
events.

Figure 5.6: Distributions in z, ΔR12, and a1 for pruned and unpruned jets in matched
pp → jets events.

Figure 5.7: Distribution in mJ for pruned and unpruned jets in pp → tt̄ events, using
the CA and kT algorithms. Jets have pT > 500 GeV and D = 1.0.

Figure 5.8: Distribution in mJ for pruned and unpruned jets in matched pp → jets
events, using the CA and kT algorithms. Jets have pT > 500 GeV and D = 1.0.

In the background mass plots, we can see that unlike in the e+e− case, here pruning
lowers the number of jets in the top mass window. The distinction between e+e− and
pp jets is related to the contrast between FSR and FSR+ISR+UE jets. As for e+e−
events, pruning has little effect on high-mass jets in the FSR sample. Here large jet
masses are presumably coming from hard, large-angle radiation that pruning cannot
remove. Recall that the pp sample is a matched sample with 2, 3, or 4 final state
partons. By contrast, in the FSR+ISR+UE sample, moderately heavy jets have their
masses increased by the inclusion of additional radiation from the rest of the event,
pushing them into the top mass window. Pruning can remove this radiation, moving
these jets back out of the top window and reducing the background to the top quark
signal.

5.3.1 Parton-hadron comparison 

Finally, it is instructive to revisit the "parton-hadron" comparison from Sec. 4.4.* In
Fig. 5.9, we reproduce Fig. 4.23, using pruning at both the hadron and parton level.
The parton-level pruning is implemented in the same way as defined above, treating
the three partons of the reconstructed top quark as the jet.

* This subsection is taken, with small modifications, from Sec. VI A of [2].

By comparing Figs. 4.23 and 5.9, we again can see that pruning has removed much
of the systematic effects in the CA algorithm; when only a jet mass cut is made, the
distributions in z and ΔR12 for pruned jets match the parton-level distributions much
better than unpruned jets. When both mass and subjet mass cuts are made, pruning
shows a slightly poorer agreement with the parton-level kinematics than the unpruned
case. Note however that for pruned jets, the efficiency of the subjet mass cut is
considerably greater since we more often identify one of the daughter subjets as a W
(see the discussion of Fig. 5.11 in Sec. 5.5.1).

We move on to examine pruning through a set of studies using Monte Carlo simulated
events. We will investigate the parameter dependence of pruning, motivating
the parameters used above. We will extensively study both top and W reconstruction
with pruning, and quantify the improvements from pruning in terms of basic statistical
measures. These studies will provide evidence of the insensitivity of pruning to
the value of D in the jet algorithm.
Figure 5.9: Distributions in z and ΔR12 comparing top quark decays at the parton
level and from Monte Carlo events, after implementing pruning. This figure uses the
same samples and cuts as Fig. 4.23.

5.4 Study overview

The parameter space for questions about pruning procedures is very large.^ We will
not be able to answer all possible questions, but we will attempt to answer the most
important. We use Monte Carlo samples to study W reconstruction and the rejection
of W + jets backgrounds, as well as top quark reconstruction and the rejection of
QCD multijet backgrounds. To test the usefulness of pruning across a range of jet
m/p_T, and hence of the heavy particle boost, we study both signals in four p_T bins. We
will also be able to compare a signal with a single mass scale (the W) to one with two
(the top). The details of the Monte Carlo samples and their generation are described
in Appendix C.

In the following sections, we define a particular method to identify the heavy 
particles using jet substructure, and examine pruning in this context. We are more 
concerned with the improvements provided by pruning than its absolute performance. 
Therefore, we compare pruning to an analysis procedure where the jets are left un- 
pruned. This comparison removes dependence on quantities that have large uncer- 
tainties, such as signal and background cross sections, or are not specified, such as 
the integrated luminosity. Instead, the performance of pruning is quantified in terms 
of how much better pruning resolves the physically relevant substructure of the jet 
and separates signal and background processes versus using the substructure from 
unpruned jets. 

Additionally, we test the performance of pruning as parameters of the jet algorithm
and the pruning procedure are varied, including D. We expect the D dependence to
be closely correlated with the jet p_T, as it is a direct measure of the boost of the
heavy particle. We aim to draw some basic conclusions about how pruning should be
applied in a search.



^This section is taken from Sec. VII of [2].

5.4.1 Measures used to quantify pruning

Mass variables are by far the strongest discriminator between QCD jets and jets
reconstructing heavy particle decays. QCD jets have a smooth mass distribution set
by the jet p_T (see Sec. 4.1.1), while a decaying particle can have multiple intrinsic
mass scales. We define simple criteria to identify a jet as coming from a top quark: if
the jet mass is in the top mass window and one of the two subjets has a mass in the
W mass window, then we tag the jet as a top jet. The top and W mass windows are
defined by fitting the relevant mass peaks of the signal sample, which we describe in 
detail below. The W study proceeds analogously with only a jet mass cut. In a real 
search for a particle of unknown mass, one obviously cannot fit a "signal sample". 
However, we employ this method to demonstrate two effects of pruning: sharpening 
the signal mass peak and reducing the QCD background in this region. These two 
effects will determine how well pruning improves our ability to find bumps in jet mass 
distributions. 
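The tagging criteria just described can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used for the study: the function names are hypothetical, and the numerical windows in the usage note are placeholders rather than the fitted windows.

```python
def in_window(mass, window):
    """Return True if mass lies inside the closed interval window = (low, high)."""
    low, high = window
    return low <= mass <= high


def is_top_jet(jet_mass, subjet_masses, top_window, w_window):
    """Top tag: jet mass in the top window AND at least one of the two
    subjets from the final recombination has a mass in the W window."""
    return in_window(jet_mass, top_window) and any(
        in_window(m, w_window) for m in subjet_masses
    )


def is_w_jet(jet_mass, w_window):
    """W tag: only a jet mass cut is applied."""
    return in_window(jet_mass, w_window)
```

For example, with placeholder windows of (160, 190) GeV for the top and (70, 90) GeV for the W, `is_top_jet(172.0, (80.0, 30.0), (160.0, 190.0), (70.0, 90.0))` passes both cuts, while the same jet with subjet masses (50.0, 30.0) fails the subjet mass cut.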

We use a common set of variables to measure the difference between a jet algorithm
and its pruned version. Let N_S(A) be the number of jets in the signal sample identified
as a reconstructed heavy particle for algorithm A, and N_B(A) the analogous number
of jets in the background sample. Use pA to denote the pruning procedure run on
jets found with algorithm A. Then the variables we use are:

\[
\epsilon \equiv \frac{N_S(pA)}{N_S(A)}, \qquad
R \equiv \frac{N_S(pA)/N_B(pA)}{N_S(A)/N_B(A)}, \qquad
S \equiv \frac{N_S(pA)/\sqrt{N_B(pA)}}{N_S(A)/\sqrt{N_B(A)}}.
\]

ε is the relative efficiency of pruning in identifying heavy particles in the signal sample,
while R and S are the relative signal-to-background and signal-to-noise ratios for the
pruned and unpruned algorithms. We also evaluate the relative mass window width,
which we label w_rel. For the W study, this is the ratio of the W mass window width
for pruning relative to not pruning; for the top study it is the ratio of the top mass
window widths. Note that in the top study, a W subjet mass cut is also used. A
value of w_rel < 1 means pruning has improved the mass resolution of the jets. These
ratios are independent of the integrated luminosity and the total cross sections, and
are representative of the improvements that pruning would provide in an analysis.
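As a worked illustration, the three count-based measures (relative efficiency, relative signal-to-background, and relative signal-to-noise) can be computed from four tagged-jet counts. The function name and the counts in the usage note are made up for the example.

```python
from math import sqrt


def pruning_metrics(ns_pruned, nb_pruned, ns_unpruned, nb_unpruned):
    """Return (efficiency, R, S) for pruned relative to unpruned jets.

    ns_* and nb_* are the numbers of tagged jets in the signal and
    background samples, respectively.
    """
    eff = ns_pruned / ns_unpruned
    r = (ns_pruned / nb_pruned) / (ns_unpruned / nb_unpruned)
    s = (ns_pruned / sqrt(nb_pruned)) / (ns_unpruned / sqrt(nb_unpruned))
    return eff, r, s
```

With invented counts of 900 signal and 100 background jets after pruning, against 1000 signal and 400 background jets without pruning, `pruning_metrics(900, 100, 1000, 400)` gives an efficiency of 0.9, R = 3.6, and S = 1.8: a 10% loss of signal is more than repaid by the fourfold drop in background.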

To determine the mass window for a particular signal sample, we fit the mass peak
and extract its width. In these studies, a skewed Breit-Wigner is sufficient
to fit the peak, with a power law continuum background. The functions used to fit
mass peaks are:



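\[
\text{peak:} \quad f(m) = \frac{M^2 \Gamma^2}{(m^2 - M^2)^2 + M^2 \Gamma^2} \left[ a + b\,(m - M) \right],
\]

\[
\text{continuum:} \quad g(m) = \frac{c}{m} + \frac{d}{m^2}.
\]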



M is the location of the mass peak; Γ is the width of the peak. A sample fit is shown
in Fig. 5.10. The mass window [M - Γ, M + Γ] is found to be nearly optimal, given
this functional form, in measures similar to ε, R, and S: the area in the window (~ ε),
the ratio of area to the window width (~ R), and the ratio of area to the square root
of the width (~ S).

Figure 5.10: A sample fit showing the jet mass distribution (black histogram) and
sample fit (blue curve) for CA jets from tt̄ events.
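The fit model and the resulting mass window can be sketched as follows. Two details are assumptions made for this sketch rather than statements about the fit actually performed: the Breit-Wigner is normalized so that the peak function equals a at m = M, and the power-law continuum is taken to be c/m + d/m^2. All function names are hypothetical.

```python
def peak(m, M, Gamma, a, b):
    """Skewed Breit-Wigner: a Breit-Wigner line shape in m^2 multiplied by
    a linear skew a + b*(m - M). Normalized (assumption) so peak(M) == a."""
    bw = (M * Gamma) ** 2 / ((m * m - M * M) ** 2 + (M * Gamma) ** 2)
    return bw * (a + b * (m - M))


def continuum(m, c, d):
    """Power-law continuum background (assumed form c/m + d/m^2)."""
    return c / m + d / m ** 2


def mass_window(M, Gamma):
    """Mass window [M - Gamma, M + Gamma] taken from the fitted peak."""
    return (M - Gamma, M + Gamma)
```

A fitted peak at M = 170 GeV with Γ = 10 GeV (illustrative numbers only) would give the window (160, 180) GeV. In practice the sum `peak + continuum` would be passed to a least-squares fitter; that step is omitted here.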
5.5 Results

In this section we present results comparing analyses with pruned jets to unpruned 
jets.^ We demonstrate two main points: first, pruning is useful and broadly applica- 
ble, and second, its parameters do not need fine tuning for it to provide significant 
improvement. 

The natural starting point is to investigate the parameters particular to the pruning
procedure, D_cut and z_cut. The most important question is whether these need to
be tuned to the signal. To answer this, in Sec. 5.5.1 we study the performance of
pruning as we vary its parameters for two different signals across the full p_T range for
the samples. We find that optimal choices of z_cut and D_cut vary slowly with m/p_T,
but that our choice of parameters is not far from optimal in all cases.

After fixing z_cut and D_cut, we consider the effect of varying D in the jet algorithm.
In Sec. 5.5.2 we study pruning with D fixed at 1.0 over all p_T bins. This type of
analysis is like a search where the mass (and hence m/p_T) of the new heavy particle
is not known. For comparison, in Sec. 5.5.3 we redo the analysis, but with D adjusted
for each bin to fit the expected angular size of the decay in that bin. In this case,
the unpruned jet algorithm performs better than with a constant D, as expected, but
pruning still shows improvements in finding W's and tops. In all cases, pruned jets
are a better way to identify heavy particles than unpruned jets. In Sec. 5.5.4 we compare
the results of Secs. 5.5.2 and 5.5.3. Significantly, if jets are pruned, we find that it
does not make much difference what the initial D value was, indicating that searches
with large fixed D do not suffer in power compared to searches with D tuned to a
known or suspected m/p_T.

In Sec. 5.5.5 we give some absolute measures of top-finding with pruned jets for
comparison to other methods. In Sec. 5.5.6 we directly compare the CA and k_T
algorithms, before and after pruning. Finally, in Sec. 5.5.7 we consider the effect of a
crude detector model where we smear the energies of all particles in the calorimeter.
We find that the performance of both the pruned and unpruned algorithms is degraded,
but that pruning still provides significant improvement.

^This section is taken from Sec. VIII of [2].

5.5.1 Dependence on Pruning Parameters 

The pruning procedure we have defined has two free parameters (in addition to those
of the jet algorithms themselves). In introducing the procedure, we argued that
z_cut = 0.10 and D_cut = m_J/p_{T_J} were sensible choices. We now investigate how pruning
performs when each of these parameters is varied while the other is held fixed, for
both (W and top) signals and across the four p_T bins for each signal.

We will look at the values of the metrics w_rel, ε, R, and S defined in Sec. 5.4.1. The
priority in choosing particular values for z_cut and D_cut should be optimizing S, as
it is the criterion for discovery. That being said, ε and R are still important measures,
as they determine the total size of the signal and its remaining fraction relative to the
background. We also evaluate w_rel because the mass window width drives the other
three metrics. As the relative width decreases, in general the measures R and S will
increase because the heavy particle is better resolved and more of the background is
rejected, but ε will tend to decrease simply because the narrower width selects fewer
signal jets. ε can, however, increase with decreasing mass window width if enough
high-mass signal jets are being pruned into the mass window.

In Fig. 5.11, we show all four metrics for top and W jets, for both CA and k_T jets.
D_cut is set to m_J/p_{T_J} throughout, and z_cut is varied in [0, 0.25]. z_cut = 0 represents
no pruning, and we can see that all metrics are 1 there. With increasing pruning, the
mass window width initially decreases rapidly, then levels out. In all but the smallest
p_T bin, the relative signal efficiency ε increases as the width narrows, suggesting that
signal jets that had "vacuumed up" too much UE or soft radiation are being pruned
back into the mass window. Note that for the top quark sample with the k_T algorithm,
ε merely flattens out for a range in z_cut, and does not increase as it does for the other
samples. Once the window stops shrinking significantly (around z_cut = 0.05), the
relative signal efficiency starts decreasing; now the dominant effect is over-pruning
signal jets out of the mass window. Note, however, that even though the relative
signal efficiency is decreasing, the relative signal-to-background ratio R is increasing
over the full range. So even as signal jets are being removed from the mass window,
background jets are being removed even faster. If we look at signal-to-noise, there
appears to be a broad optimal range in z_cut that depends somewhat on the signal, on
the p_T bin, and on the jet algorithm.

[Figure 5.11 panels: (a) W's, CA jets; (b) tops, CA jets; (c) W's, k_T jets; (d) tops, k_T jets. Horizontal axis: z_cut from 0 to 0.25. p_T ranges in GeV: top sample 200-500, 500-700, 700-900, 900-1100; W sample 125-200, 200-275, 275-350, 350-425.]

Figure 5.11: Relative statistical measures w_rel, ε, R, and S vs. z_cut for W's and tops,
using CA and k_T jets. Four p_T bins are shown for each sample. Statistical errors (not
shown) are O(1%) for w_rel and ε, and O(10%) for R and S.

There are two important lessons to be learned from these plots. First, more
pruning is required for k_T jets than for CA jets to achieve similar results. The right two
columns (k_T) are similar to the left two (CA), except that features are shifted out in
z_cut. Second, the peak in S does not depend strongly on the signal or the p_T in the
three largest p_T bins. The dependence of S on z_cut in the smallest p_T bin, however, is different
from the others due to threshold effects of the heavy particle being reconstructed in
a single jet. In this bin, the boosts of the W's or tops are small enough that many
decays are just at the threshold for being reconstructed. Decays at the reconstruction
threshold typically have poor mass resolution, and cutting more aggressively on z
reduces these threshold effects and significantly decreases the background, leading to
an increase in S over the whole range in z_cut. For CA, our "reasonable choice" of
z_cut = 0.10 looks close to optimal for the upper three bins, and not far off for the smallest.
For k_T, a larger z_cut is needed; 0.15 is close to optimal.

Additionally, these plots offer an interesting perspective on the role of z in jet
substructure. The tt̄ sample for the CA algorithm is the most instructive. In this
case, small values of z_cut lead to dramatically increased efficiency for finding top jets
in the larger p_T bins. This is due to the improved ability after pruning to find the
W as a subjet of the top. At large p_T with a fixed D = 1.0, the opening angle of
the top quark decay is much smaller than D. This means that the top quark decay
is very localized in the jet, and much of the jet area includes soft radiation. For the
CA algorithm, which recombines solely by the angle between protojets, this tends to
delay recombining the soft peripheral radiation until the end of the algorithm. The
result is substructure with small z at the last recombination that is not representative
of the top quark decay: neither daughter protojet of the top has the W mass. As an
illustration of this point, in Fig. 5.12 we plot the distribution of z for unpruned jets
in the top mass range for the CA algorithm in the largest and smallest p_T bins. Note
that in the largest p_T bin, where the top quark decay is highly localized in the jet and
the decay angle is much less than D, there is a substantially increased fraction of jets
with a small value of z. This does not occur in the smallest p_T bin, where most of
the reconstructed tops are at threshold for being just inside the jet. When pruning is
implemented, however, much of this soft radiation is removed. In Fig. 5.13, we plot
the same distributions as in Fig. 5.12, but for pruned jets. In this case, no jets with
the top mass have small z, since pruning has removed those recombinations. This
leads to a highly enhanced efficiency to resolve the W subjet and identify the jet as
a top jet. In Sec. 5.5.3, we will study pruning when the value of D is matched to the
average angle of the heavy particle decay, and we will see that the performance of the
unpruned CA algorithm improves.

Figure 5.12: Distribution in z for unpruned CA jets in the top mass window for two
p_T bins: (a) p_T bin 1, 200-500 GeV; (b) p_T bin 4, 900-1100 GeV. The small-p_T bin
distribution (left) has only a small enhancement of entries at small z, while the
large-p_T bin distribution (right) is dominated by small z.

Figure 5.13: Distribution in z for pruned CA jets in the top mass window for two p_T
bins, using z_cut = 0.10.

By contrast, this situation does not occur for the k_T algorithm. Even when the
value of D is mismatched with the top quark decay angle, the soft radiation on
the periphery of the jet is recombined early in the k_T algorithm because of the p_T
weighting in the recombination metric. Therefore, there is no increase in efficiency
with increasing z_cut for large p_T, and the decrease in ε comes from the narrower width
of the top and W mass distributions. The small variation in the measures R and S
for the k_T algorithm at small z_cut is evidence of the fact that k_T tends to have many
fewer small-z recombinations at the end of the algorithm, and supports the larger
value of z_cut = 0.15 for the k_T algorithm that we will use in the remainder of the
study.

We now fix z_cut to study the dependence on D_cut. For the CA algorithm we
choose z_cut = 0.1, and for k_T we choose 0.15. In Fig. 5.14, we plot w_rel, ε, R, and S
as D_cut is varied in [0, 5 m_J/p_{T_J}]. While z_cut sets the minimum p_T asymmetry that
recombinations can have, D_cut sets the minimum opening angle for recombinations
that can be pruned. We can think of D_cut as determining which recombinations can
be pruned, and z_cut as determining whether or not that pruning takes place. This
difference is clearer when we consider two limiting values of D_cut and their impact on
the pruned jet substructure.

Figure 5.14: Relative statistical measures w_rel, ε, R, and S vs. D_cut for W's and
tops, using CA and k_T jets. Four p_T bins are shown for each sample. Statistical errors
(not shown) are O(1%) for w_rel and ε, and O(10%) for R and S.

As D_cut grows past 2m_J/p_{T_J}, any recombination must have a large opening angle
between the daughters to be pruned. Note that the limit D_cut → ∞ is the limit
of no pruning. For both the CA and k_T algorithms, in this limit only very late
recombinations in the algorithm can be pruned (if the jet can be pruned at all). In
this limit, we expect the statistical measures to tend to one as the amount of pruning
decreases.

The second limit is D_cut → 0. In this limit any recombination can be pruned, since
the minimum opening angle needed is very small. As D_cut decreases towards zero,
more of the jet substructure can be pruned. In particular, earlier recombinations
(those with smaller opening angle on average) can be pruned as D_cut decreases. In
general, these early recombinations are associated with the QCD shower, and pruning
them can degrade the mass resolution of the jet because too much radiation is being
removed. Therefore, we expect the performance of pruning to be poor in this region.
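The roles of the two parameters, and the two limits just discussed, can be made concrete with a sketch of the test applied to a single recombination. It assumes the usual definition z = min(p_T1, p_T2) / p_T of the merged protojet, with the angular separation measured in rapidity and azimuth; the function names are hypothetical and this is not the code used for the study.

```python
from math import hypot, pi


def delta_r(y1, phi1, y2, phi2):
    """Angular separation in the (rapidity, azimuth) plane, with the
    azimuthal difference wrapped into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2 * pi)
    if dphi > pi:
        dphi = 2 * pi - dphi
    return hypot(y1 - y2, dphi)


def should_prune(pt1, pt2, pt_merged, dr, z_cut, d_cut):
    """Return True if the recombination is vetoed (softer branch dropped).

    d_cut decides whether the recombination is wide enough to be a
    candidate for pruning; z_cut decides whether it is asymmetric enough
    for the pruning to actually take place.
    """
    z = min(pt1, pt2) / pt_merged
    return z < z_cut and dr > d_cut
```

For a jet with m_J/p_{T_J} = 0.2 and z_cut = 0.1, a recombination with z = 0.05 is pruned at ΔR = 0.5 but kept at ΔR = 0.1. Sending `d_cut` to infinity makes the second condition fail for every recombination (no pruning), while `d_cut = 0` exposes every asymmetric recombination, however early in the shower.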

Both of these limits are present in Fig. 5.14, and our expectations about these
limits are borne out. It is in the intermediate region, where D_cut ≈ m_J/p_{T_J}, that the
performance of pruning is optimal, with a maximum in S that is not very sensitive
to the p_T bin, sample, or algorithm. This value of D_cut = m_J/p_{T_J} is sensible when
we recognize that the average opening angle of the jet is approximately 2m_J/p_{T_J}, and
half this value allows for pruning of late recombinations but not the soft, small-angle
recombinations associated with the QCD shower.

For the remainder of the study, we fix the pruning parameters z_cut = 0.1 for the
CA algorithm and z_cut = 0.15 for the k_T algorithm, as well as D_cut = m_J/p_{T_J} for both
algorithms. With these parameters fixed, we move on to discuss more interesting tests
of the pruning procedure.



5.5.2 Top and W Identification with Constant D

Figure 5.15: Relative statistical measures w_rel, ε, R, and S vs. p_T for W's and tops,
using CA and k_T jets with D = 1.0. Statistical errors are shown.

In a search for heavy particles decaying into jets, it may be infeasible to divide
a sample into p_T bins and use a tailored jet algorithm to look for local excesses
in the jet mass distribution in each p_T bin. (A "variable-R" method for avoiding
p_T-binning, which we do not consider here, has recently been suggested [91]. This
still requires knowing or guessing the mass of the new particle, since it is m/p_T that
determines the relevant angular size.) For instance, the appropriate angular scale may
be unknown because the mass of the heavy particle is not known or the production
mechanism is not well understood (so that the spectrum of heavy particle boosts is not
known). In this case, a large-D jet algorithm may be used to search for heavy particles
reconstructed in single jets. To mimic such an analysis, and provide a reference point
for further tests of pruning, we find our statistical measures for W and top quark jets
with a fixed D of 1.0.

In Fig. 5.15 we plot the values of w_rel, ε, R, and S versus p_T bin for W's and
tops, using the CA and k_T algorithms.^ Pruning improves W and top finding for
both algorithms, with substantial improvements at large p_T. The improvement in S
ranges from 30-40% in the smallest p_T bins, growing to between 100% and 600% in the
largest p_T bins. At large p_T in the top quark study, the improvement in signal-to-noise
for the CA algorithm is larger than for the k_T algorithm, as is the relative efficiency
to identify tops. This arises because the CA algorithm is poor at reconstructing the
W as a subjet of the top jet at large p_T when the value of D is not matched to the
opening angle of the decay. We will investigate this case further in the rest of the
analysis.

5.5.3 Top Identification with Variable D 

For an analysis where the heavy particle mass is known, the jet algorithm can be
tailored to the jet p_T. The D value can be chosen using the relation

\[
D = \frac{2m}{p_T}, \tag{5.5.3}
\]

where m is the heavy particle mass and p_T is the transverse momentum of the jet.
We take 1.0 to be the maximum allowed value of D. The D values we use are given in
Table 5.1. In Fig. 5.16, we plot w_rel, ε, R, and S for jets with these D values used for
each p_T bin. Note that Eq. (5.5.3) neglects the differences between algorithms, which
depend on the particular decay. As an example of the fidelity of this relation for D,
recall Fig. 4.13, which plotted the distribution in ΔR for reconstructed parton-level
top quark decays with a top boost of γ ≈ 3. Eq. (5.5.3) suggests the value D = 0.7,
while the means of the CA and k_T distributions for the reconstructed parton-level
decay are 0.75 and 0.65 respectively. Because the distribution in opening angles of
the reconstructed decay is broad, by using a smaller, fixed D some decays will not be
reconstructed by the jet algorithm.

^The statistical error bars shown are primarily due to the limited number of events in the
background sample after pruning.

Figure 5.16: Relative statistical measures w_rel, ε, R, and S vs. p_T for W's and tops,
using CA and k_T jets. Instead of a fixed D = 1.0, a tuned D is used for each p_T bin
(see Table 5.1). Statistical errors are shown.

Table 5.1: "Tuned" D values for W and top p_T bins. The fixed-D analysis used
D = 1.0, so the smallest bin does not change.
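The numbers in this example can be checked with a short sketch. It assumes the tuning rule D = 2m/p_T capped at 1.0 and, purely for the check, approximates p_T by the momentum magnitude m·sqrt(γ² - 1) of a particle with boost γ. The function names and the top mass value in the usage note are illustrative.

```python
from math import sqrt


def tuned_d(mass, pt, d_max=1.0):
    """Jet-algorithm D matched to the expected decay opening angle
    2m/p_T, capped at d_max."""
    return min(d_max, 2.0 * mass / pt)


def pt_from_boost(mass, gamma):
    """Momentum of a particle with the given mass and boost gamma;
    used here as a stand-in for p_T (assumption: central production)."""
    return mass * sqrt(gamma ** 2 - 1.0)
```

For a boost of 3 the mass cancels, `tuned_d(172.0, pt_from_boost(172.0, 3.0))` is 2/sqrt(8) ≈ 0.71, in line with the value D = 0.7 quoted above. At low p_T the cap takes over: `tuned_d(172.0, 200.0)` returns 1.0.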

The difference between the case of constant D = 1.0 and variable D is readily
apparent. When the D value is matched to the expected opening angle of the decay,
the improvements from pruning are flatter over the whole range in p_T, and generally
decrease towards high p_T. The decreased efficiency for pruning, especially for the
k_T algorithm, is outweighed by the increases in R and S over the whole range in p_T.

Figure 5.17: Relative statistical measures w_D, ε_D, R_D, and S_D vs. p_T for W's and
tops, using CA and k_T jets. The measures now compare pruning with a tuned D
value in each p_T bin to pruning with a fixed D. Statistical errors are shown.



5.5.4 Comparing Pruning with Different D Values

In the previous two subsections we saw that an unpruned analysis performs much
better when D is tuned to the m/p_T of the signal. We now consider whether this is
true of a pruned analysis.

In each p_T bin, we can compare the results for pruned jets with D = 1.0 to those for
pruned jets using a value of D fit to the expected size of the decay. Because the naive
expectation is that the tuned value of D will yield better separation from background,
we find the improvements in pruning when D is tuned, relative to pruning with a fixed
D of 1.0. Analogous metrics, w_D, ε_D, R_D, and S_D, are used, but now they compare
the results from pruning with the tuned D value to the results from pruning with
D = 1.0. For instance,

\[
R_D \equiv \frac{S/B \text{ from pruning with tuned } D}{S/B \text{ from pruning with } D = 1.0}.
\]

Note that a value greater than 1 for ε_D, R_D, or S_D indicates that tuning D yields an
improvement. The values of these four measures are shown in Fig. 5.17 over the range
of p_T.^ Note that since the tuned value of D in the smallest p_T bin is 1.0, the
comparison there is trivial and so is not shown.

These results show only small improvements in S_D, with the statistical error bars
at most data points including the value S_D = 1. They indicate that the results after
pruning are roughly independent of the value of D used in the jet algorithm, as long
as that D is large enough to fit the expected size of the decay in a single jet. From
the point of view of heavy particle searches, we can conclude that pruning removes
much of the D dependence of the jet algorithm in the search.
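Each point in such a comparison is a ratio of two measured quantities, each with its own statistical uncertainty. A minimal sketch of the standard combination for a ratio, assuming the two errors are independent and small enough for linear propagation (the function name is hypothetical):

```python
from math import sqrt


def ratio_with_error(x, dx, y, dy):
    """Return (x/y, uncertainty) with relative errors added in quadrature.

    Valid for independent uncertainties dx and dy; positively correlated
    errors would partly cancel in the ratio, so this overestimates the
    uncertainty in that case.
    """
    r = x / y
    dr = r * sqrt((dx / x) ** 2 + (dy / y) ** 2)
    return r, dr
```

For two measurements of 2.0 ± 0.2, `ratio_with_error(2.0, 0.2, 2.0, 0.2)` gives a ratio of 1.0 with an uncertainty of about 0.14, so a 10% error on each input becomes a 14% error on the ratio.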

^The statistical errors now have significant contributions from both pruned background samples.
Each "measurement" compares the results of two methods, where each method has an associated
uncertainty (the error bars in Figures 5.15 and 5.16). These errors are not independent because
the same initial background sample is used in each case. The combined uncertainties in this figure
assume that the individual errors are independent, so should be viewed as an upper bound and at
best a rough estimate of the statistical uncertainty.

5.5.5 Absolute Measures of Pruning

So far, we have only considered measures of pruning relative to a similar analysis
without pruning, because this factors out much of the dependence on details of the
samples. However, several recent studies report absolute performance metrics for
heavy particle identification, so we examine similar measures here for completeness.
In addition, we directly compare the CA and k_T algorithms, with and without pruning.

As can be seen from the plots of w_rel in previous sections, pruning reduces the
width of the mass distribution for heavy particles. In Fig. 5.18, we plot the absolute
widths of the fitted mass distributions for both the top and W in the tt̄ sample and
the W in the WW sample, over all p_T bins. We plot this width for the pruned and
unpruned versions of the CA and k_T algorithms.

Note that the heavy particle identification method we use in this work selects jets
within a range of width 2Γ, with Γ coming from a fit to the signal sample. This gives
rise to a mass range cut that is typically much narrower than the fixed-width ranges used
in other studies, and hence the absolute efficiency to identify heavy particles is lower.

In Figs. 5.19a and 5.19b, we plot the absolute efficiency to identify tops and W's
in the two signal samples for both algorithms, with and without pruning. For the top
sample, this efficiency ε_abs is the ratio

\[
\epsilon_{\rm abs} \equiv \frac{\#\text{ of top jets in the signal sample}}{\#\text{ of parton-level tops in the } p_T \text{ range}}
\]

for each p_T bin, with ε_abs defined analogously for the W sample. Because the
substructure of the W decay is much simpler than the top decay, with no secondary mass
cut, the absolute identification efficiencies are similar between all algorithms.

The efficiency to find top quarks is only meaningful when compared to the fake
rate for QCD jets to be misidentified as a top quark. We define this fake rate as

\[
\epsilon_{\rm fake} \equiv \frac{\#\text{ of fake top jets in the background sample}}{\#\text{ of unpruned jets in the } p_T \text{ range}}
\]

for each p_T bin, and analogously for the W sample. In Figs. 5.19c and 5.19d, we plot
ε_fake for tops and W's in the two background samples for both algorithms, with and
without pruning. The fake rate is significantly reduced for pruned jets compared to
unpruned jets, for both the top and W studies. The decrease in absolute efficiency
arising from using a narrow mass window is compensated by a correspondingly small
fake rate for QCD jets.

Figure 5.18: Widths of the top jet (a), W subjet of the top jet (b), and W jet (c)
mass windows for the top and W signal samples.

Figure 5.19: ε_abs and ε_fake vs. p_T bin, for the CA and k_T algorithms with and without
pruning, using D = 1.0. A "p" before the algorithm name denotes the pruned version.
The legend for figure (a) applies to figures (b) and (d); note the scale difference for
k_T jets in (c).



For top quarks, the efficiencies shown in Fig. 5.19 can be compared with those
given in Table 5 of [45] for several other top-finding methods. Our highest p_T bin is
relevant for the comparison. More than a few words of caution are in order, however.
Unlike the pruning-to-not-pruning comparisons we have presented so far, comparisons
between methods using absolute efficiencies will depend on the details of the signal
and background samples, as well as the details of the various cuts included in each
analysis. For example, the cuts we have used in this analysis are narrower than the
fixed mass window cuts used in other top-finding algorithms, and hence our top
identification efficiency and background fake rate are both lower than described in
other methods. We intend to perform a more thorough comparison between different
substructure approaches in a future work.

Figure 5.20: Relative statistical measures comparing CA to k_T jets and pruned CA to
pruned k_T jets vs. p_T for W's and tops, using D = 1.0. Statistical errors are shown.

5.5.6 Algorithm Comparison

Throughout this paper, we have studied how pruning compares to not pruning for
the CA and k_T algorithms. However, it is also of interest to study how the CA and
k_T algorithms compare, with and without pruning. To do this, we use statistical
measures w_A, ε_A, R_A, and S_A analogous to w_rel, ε, R, and S. For instance,

\[
R_A \equiv \frac{S/B \text{ from the CA algorithm with } D = 1.0}{S/B \text{ from the } k_T \text{ algorithm with } D = 1.0}.
\]

We will change the subscript to pA to compare the pruned versions of the algorithms,
e.g.,

\[
R_{pA} \equiv \frac{S/B \text{ from pruned CA with } D = 1.0}{S/B \text{ from pruned } k_T \text{ with } D = 1.0}.
\]

In Fig. 5.20, we plot the measures comparing CA to k_T and pruned CA to pruned k_T
for both the WW and tt̄ samples.

These comparisons illustrate many of the effects that we have observed throughout
this paper. For the unpruned algorithm comparison, CA tends to have a much lower
efficiency to identify tops than k_T. As p_T increases, CA performs more poorly relative
to k_T, with the efficiency decreasing significantly. This arises because CA has a
decreasing efficiency to identify the W at high p_T, when the top quark becomes more
localized in the fixed-D jet. Pruning corrects for this, though the performance of CA
relative to k_T still decreases at high p_T.

The WW sample is instructive because it lets us compare the effectiveness of
pruning between CA and k_T across a wide range in p_T. For the unpruned algorithms,
the performance of CA relative to k_T is fairly consistent over all p_T, reflecting the fact
that W identification is simpler than top identification, with accurate mass
reconstruction the only requirement. However, when the jets are pruned, the performance
of pruned CA relative to pruned k_T improves in the smallest p_T bin and worsens in
the largest p_T bin, as compared to the performance of CA versus k_T for unpruned
jets. This skewing indicates that pruning is more effective for CA than k_T at small
p_T, where threshold effects are important, and more effective for k_T than CA at large
p_T.

5.5.7 Detector Effects 

So far, no detector simulation has been applied to our events aside from clustering 
particles into massless calorimeter cells. We now consider a technique that approx- 
imates the impact that detector resolution has on the effectiveness of pruning. We 
modify our top and jet analyses by smearing the energy E of each calorimeter
cell with a factor sampled from a Gaussian distribution with mean E and standard
deviation σ given by

    \frac{\sigma(E)}{E} = \sqrt{\frac{a^2}{E} + \frac{b^2}{E^2} + c^2}

We consider a parameter set motivated by the expected ATLAS hadronic calorime-
ter resolution [92], {a, b, c} = {0.65, 0.5, 0.03}. One obvious effect of the detector
smearing is degraded mass resolution. In Fig. 5.21, we show this effect by plotting
the jet mass distribution for the tt̄ sample in the first pT bin. Even after smearing,
however, pruning improves the jet mass resolution. In Fig. 5.22, we plot the pruned
and unpruned jet mass distributions for the tt̄ sample in the first pT bin. Note that
because the QCD jet mass distribution is smooth, only the overall size of the sample 
in the mass window changes, so we do not plot these distributions. 
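
The smearing just described can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the code used to produce the figures: the names Resolution, cell_sigma, and smear_energy are hypothetical, the resolution is assumed to have the usual three-term calorimeter form sigma/E = sqrt(a^2/E + b^2/E^2 + c^2) with E in GeV, and clipping negative energies to zero is a choice made here for simplicity.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Assumed resolution model: sigma/E = sqrt(a^2/E + b^2/E^2 + c^2), E in GeV.
// Defaults are the ATLAS-motivated set {a, b, c} = {0.65, 0.5, 0.03} from the text.
struct Resolution {
    double a = 0.65;  // stochastic term
    double b = 0.5;   // noise term
    double c = 0.03;  // constant term
};

// Standard deviation (in GeV) of the smeared energy for a cell of energy E.
double cell_sigma(double E, const Resolution& res = Resolution()) {
    assert(E > 0.0);
    return E * std::sqrt(res.a * res.a / E + res.b * res.b / (E * E) + res.c * res.c);
}

// Draw a smeared energy from a Gaussian with mean E and width cell_sigma(E).
// Negative fluctuations are clipped to zero (an assumption of this sketch).
template <class Rng>
double smear_energy(double E, Rng& rng, const Resolution& res = Resolution()) {
    std::normal_distribution<double> gauss(E, cell_sigma(E, res));
    double smeared = gauss(rng);
    return smeared > 0.0 ? smeared : 0.0;
}
```

With the default parameters, cell_sigma(100.0) is about 7.2 GeV, so under this assumed model a 100 GeV cell is smeared at roughly the 7% level.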

In Fig. 5.23, we repeat the basic analysis of Sec. 5.5.2, applying the detector smear-
ing described above. This figure can be compared to Fig. 5.15 from the previous
analysis, which plots the same measures when no energy smearing is used. The im- 
provements are very similar to those for unsmeared jets, good evidence that pruning 
may retain its utility in a more realistic detector simulation or in real data.

Figure 5.21: Distribution in jet mass for tt̄ events, with (dashed) and without (solid)
energy smearing. The jets have pT of 200-500 GeV and D = 1.0, and there is no
pruning.




Figure 5.22: Distribution in jet mass for pruned (dashed) and unpruned (solid) jets,
for tt̄ events with energy smearing. The jets have pT of 200-500 GeV and D = 1.0.

Figure 5.23: Relative statistical measures w_rel, ε, R, and S vs. pT for W's and tops,
using CA and kT jets. Calorimeter cell energies are smeared as described in the text.
Statistical errors are shown.

5.6 Relation to other methods

To the best of this author's knowledge, the earliest paper addressing heavy particle
identification with jet substructure is a 1994(!) paper by Michael Seymour [93], which
considers W finding in the context of a Higgs search. In addition to a mass cut on
the W jet, cuts are applied on two angular separations ΔR: the angles from each subjet
to the other and to the jet axis. To reduce the effect of the underlying event, a reclustering
— filtering — procedure is applied. Germinal forms of the concepts of jet areas and 
variable R parameters are also discussed. Sub sole nihil novi est. 

Interest in substructure perked up again several years later with two papers by
Butterworth, et al. [94, 95], which proposed using the variable y pT,jet^2 = d_ij. This
is the merging distance for the last step in the kT algorithm, expected to be O(M^2)
for the decay of a heavy particle in a single jet. The restrictions on the branching
kinematics that appear in subsequent substructure methods are all variations of this 
idea. 
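
The O(M^2) scaling can be made explicit with a short small-angle estimate. This is a sketch under stated assumptions rather than a result quoted from [94, 95]: the two decay products are treated as massless, their opening angle is taken to be small, any overall normalization by the jet radius parameter is omitted, and z (notation introduced here for illustration) is the fraction of the jet pT carried by one of the two subjets.

```latex
\begin{align}
  M^2 &\approx z(1-z)\, p_{T,\mathrm{jet}}^2\, \Delta R_{ij}^2 , \\
  d_{ij} &= \min(p_{Ti}^2, p_{Tj}^2)\, \Delta R_{ij}^2
          = \min(z, 1-z)^2\, p_{T,\mathrm{jet}}^2\, \Delta R_{ij}^2
          \approx \frac{\min(z, 1-z)}{\max(z, 1-z)}\, M^2 .
\end{align}
```

For a roughly symmetric decay the prefactor is of order one, so d_ij is comparable to M^2. For the soft splittings typical of QCD jets, z is small and d_ij falls well below the jet mass squared.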

The "mass-drop filter" method proposed in [15] contained a novel feature: instead
of using the kT algorithm to construct substructure and cut on the final d_ij, the "mass-
drop" step involved discarding elements of the substructure from the top down. After
first clustering with Cambridge-Aachen, the top-level merging is checked for a large 
mass drop (indicating that the mass of the merged jet is coming from the kinematics 
of a decay, not just a heavy subjet). If this cut fails, instead of rejecting the jet, 
the lighter subjet is discarded and the search continues on the heavier subjet. After 
discarding extraneous subjets in this way, the remaining jet is "filtered" in a method
similar to [93]: the constituents of the jet are reclustered with a smaller R and the
hardest three subjets are kept.
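
The top-down search can be sketched on a toy clustering tree. This is a minimal illustration, not the implementation of [15]: the Subjet struct and find_mass_drop are hypothetical names, the threshold mu is a free parameter whose value is an assumption of the caller, and only the mass-drop condition described above is checked (any further requirements of the published method, and the filtering step, are not reproduced here).

```cpp
#include <cassert>
#include <memory>

// Toy node in a clustering history. A node with no parents is an input particle.
struct Subjet {
    double mass = 0.0;
    std::unique_ptr<Subjet> heavier;  // heavier of the two merged subjets
    std::unique_ptr<Subjet> lighter;  // lighter of the two merged subjets
};

// Walk down from the top-level merging. At each step, accept the node if the
// heavier subjet has mass below mu times the node mass (a "large mass drop").
// Otherwise discard the lighter subjet and continue on the heavier one.
// Returns nullptr if no merging passes the test.
const Subjet* find_mass_drop(const Subjet* jet, double mu) {
    assert(mu > 0.0);
    while (jet && jet->heavier && jet->lighter) {
        if (jet->heavier->mass < mu * jet->mass) return jet;
        jet = jet->heavier.get();
    }
    return nullptr;
}
```

The key design point is that a failed test does not reject the jet: the loop simply drops the lighter branch and keeps descending.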

The "top-tagging" method proposed in [18] implements a variant of the mass-
drop scheme for identifying the relevant substructure in a heavy particle jet, and
by repeating the subjet-splitting procedure twice also achieves some of the success
of filtering. The top-tagging procedure also involves finding jets with CA and then
looking backwards through the merging history for a large-scale splitting. Branchings
where one subjet is very soft are discarded; branchings where both subjets are soft
or the subjets are too close together are "irreducible", and branchings where neither
of these is true are valid splittings. After looking for a valid top-level splitting, the
procedure is repeated once on each subjet, resulting in up to four subjets. The tagger 
requires that at least three be found. 

Jet pruning can be thought of as a generalization of the subjet-identification step 
of top-tagging, but with two important distinctions. First, pruning is run from the 
bottom up, with any merging failing a kinematic cut being discarded as a jet is built 
up. Second, because the procedure is bottom-up, the kinematic comparisons are
local: in top tagging, a subjet is too soft if z = pT,i/pT,jet is too small; in pruning the
relevant value is z = pT,i/pT,(i+j), where (i + j) represents the merger of subjets i and j.

The discussions in this chapter have demonstrated that pruning is a generic tool: 
it is successful on a variety of signals over a wide range in m/pT, and does not
require fore-knowledge of the number of subjets expected or any particle masses. The 
precisely optimal parameters will depend on these details, but as we have seen (see 
Figs. 5.11 and 5.14) this dependence is not strong. 

Another "grooming" method, "jet trimming", was proposed in [20]. Trimming is
similar to filtering, but instead of keeping some fixed number of subjets, subjets which
contain at least some fraction of the jet's pT are kept.
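
The selection rule in trimming is simple enough to state in code. The following is a hedged sketch, not the implementation of [20]: the function name trim_subjets and the parameter name f_cut are hypothetical, subjets are represented only by their pT values, and the reclustering of the jet constituents into subjets is assumed to have been done already.

```cpp
#include <cassert>
#include <vector>

// Keep the subjets that carry at least a fraction f_cut of the jet pT.
// subjet_pts holds the pT of each subjet; jet_pt is the pT of the original jet.
std::vector<double> trim_subjets(const std::vector<double>& subjet_pts,
                                 double jet_pt, double f_cut) {
    assert(jet_pt > 0.0);
    std::vector<double> kept;
    for (double pt : subjet_pts) {
        if (pt >= f_cut * jet_pt) kept.push_back(pt);
    }
    return kept;
}
```

Unlike filtering, the number of subjets that survive is not fixed in advance: it depends on how the jet pT is shared among them.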

Instead of using jet and subjet masses that have been improved by grooming, 
several studies have proposed other substructure variables to distinguish decays from 
heavy particles [96, 97]. These are generally based on the kinematics of the last few
mergings in kT jets.

An alternative to considering the properties of subjets is to use jet shape or energy 
flow variables, as in [52, 53, 98, 99, 100]. An interesting idea, "N-Subjettiness",
described in [101], interpolates the number of subjets as a smooth jet shape.

Finally, one more difference between decays and QCD was exploited in [102]: color
flow. The variable "pull" was shown to characterize the fact that a color singlet's decay 
products are color connected, whereas partons in a QCD jet are often color connected 
to other parts of the event. 

While all of these methods rely on similar physics, it turns out that combinations 
of them can be even more useful [103, 104, 105]. In fact, [105] found that it took 25
substructure variables to saturate the improvement in W finding. 

Which of the various substructure methods is "best" is a largely open — and 
largely unanswerable — question, with the answer presumably depending on the sig- 
nal in question and potentially on details of the detector, luminosity, event topology, 
etc. The bewildered and justifiably irritated experimentalist is perhaps to be con- 
soled only with the assurance that tools such as the FastJet plugin mechanism and 
the SpartyJet package will make comparisons simple to perform. In the example 
SpartyJet analysis in Appendix D, I will show some comparisons between pruning 
and its relatives in top and W finding. 

5.7 Using pruning

For readers interested in using pruning in their own analyses, the author has released 
a software package, FastPrune [106], to make this simple. The FastPrune package
includes two simple means of including pruning in a jet analysis. A pruning FastJet
plugin allows the user to find pruned jets (specifying the finding and pruning jet
algorithms, as well as the z_cut and D_cut parameters) in precisely the same manner as
for any other jet algorithm. The latest version also includes a pruning tool for use
with SpartyJet. This tool takes as input jets found with some other algorithm,
and returns the pruned versions. In an analysis that compares pruned with unpruned
algorithms, this saves the step of finding the unpruned jets twice (once for the un-
pruned analysis and once as the first step of the pruning FastJet plugin). The use
of these tools is described more fully in Chapter 6.

5.8 Summary

In Chapter 4, we demonstrated that a variety of systematic effects shape the sub-
structure of heavy particles reconstructed in single jets.^7 We have identified regions
in the variables z and AR where individual recombinations are unlikely to represent 
the kinematics of a reconstructed heavy particle. Specifically, soft, large-angle recom- 
binations are unlikely to arise from the accurate reconstruction of a heavy particle 
decay, and are likely to come from QCD jets, uncorrelated radiation, or systematic 
effects of the jet algorithm. For the CA algorithm, we have demonstrated that these 
soft, large-angle recombinations are a key systematic effect that shapes the substruc- 
ture of the jet, in particular the final recombinations. 

In this chapter we have presented a procedure, called pruning, that eliminates
soft, large-angle recombinations from the substructure of the jet. Using hadronically
decaying top quarks and W bosons as test cases, we have demonstrated that the
pruning procedure improves the separation between heavy particle decays and a QCD
multijet background. We have motivated the parameters of the pruning procedure 
and demonstrated that they roughly optimize the improvements from pruning in our 
study for both top quarks and W bosons. 

Our studies on pruning have demonstrated many positive results of the procedure. 
In a heavy particle search, the jet is sensitive to the parameter D, and if the value 
of D is not well matched to the decay of a heavy particle then the ability to identify 
that particle in single jets is greatly reduced. Our results indicate that pruning re- 
moves much of the jet algorithm's dependence on D. Pruning shows improvements 
even when D is adjusted to fit the expected decay of the heavy particle. We have 
demonstrated that pruning largely removes the effects of the underlying event, as the 
underlying event mainly contributes soft, uncorrelated radiation that can be pruned 
away. Additionally, we have shown that the results of pruning are robust to a basic
energy smearing applied to the calorimeter cells used to seed the jet algorithm.
Finally, we have quantified absolute measures of the pruning procedure that can be
used to compare to other jet substructure methods.

^7 This section, with small modifications, is taken from Sec. IX of [2].

It should be reiterated that pruning systematizes methods that have been proposed 
by other authors for specific searches. Pruning should be applicable to a wide range of 
searches, and is intended to be a generic jet analysis tool. We have detailed the ideas 
behind why pruning works and why it should be used, and presented an in-depth 
discussion of many of the physics issues arising when studying jet substructure. 

5.8.1 Future Prospects 

The conclusions in this chapter, like those for any analysis technique not demonstrated 
on real data, must be taken cautiously. This is especially true for studies like this one 
on jet substructure, where a majority of the work has been in exploring techniques 
that may — or may not — actually be useful in an experiment. However, new 
techniques like jet substructure offer great promise. All studies thus far indicate that 
jet substructure, and in general a more innovative approach to jets, will be a useful 
tool for understanding the physics in events with jets at collider experiments. 

The most obvious and immediate application of pruning, and jet substructure 
tools in general, is in rediscovery of the Standard Model at the LHC. As the LHC
collects data from high-energy collisions, there will be an abundant sample of high-
pT top quarks, and W and Z bosons with fully hadronic decays. As these channels
are observed using standard analyses, jet substructure techniques can be applied and 
tested. These channels can also serve as key calibration tools for jet substructure 
methods applied in the search for new physics. 

From the theoretical side, improvements in jet-based analyses can come from a 
variety of sources. As calculations in perturbative QCD progress, they can be used to 
improve predictions for jet-based observables in QCD. Improved Monte Carlo tools, 
such as the continued implementation of next-to-leading order matrix elements and
better parton showers, will lead to more accurate studies and a better understanding
of jet physics. Additionally, the SCET framework will improve our understanding
of QCD jets. As SCET is adapted to describe a wider variety of event topologies
and realistic jet algorithms are implemented in the effective theory, it can be used to
calculate resummed predictions [107, 58, 61] for jet-based observables and accurately
describe processes that are difficult to access with fixed-order perturbative QCD. 
Jets will likely play a central role in new physics searches at the LHC, and a better 
understanding of jets and jet substructure can aid in the discovery process.

Chapter 6

TOOLS TO STUDY JET SUBSTRUCTURE 

An "event" at a hadron collider typically consists of very many^1 outgoing particles,
mostly electrons, photons, and hadrons. Given this multiplicity, calculating cross 
sections differential in the momenta of all outgoing particles is clearly intractable. 
We can perform analytic calculations for suitably inclusive quantities, such as the 
total cross section for a specific process, but we certainly cannot make fine-grained 
predictions of, say, jet substructure. In addition, if we want to simulate the effects of 
the detector, we need a way to produce realistic, high-multiplicity events, either with
an appropriate distribution or with known weight factors. This is the task of a Monte 
Carlo event generator. 

The output of an event generator is a list of particles and their four-momenta. The 
next step in a realistic analysis is to simulate the output of a particle detector such 
as ATLAS or CMS, given a specific particle-level event. This can involve detailed 
simulation of the interaction of particles passing through the various materials of 
the detector as well as instrumental response, or much cruder approximations where 
particles are grouped together into "calorimeter cells" and assumed to be measured 
with some resolution. 

If the final state involves jets, detector outputs such as calorimeter cells must be
clustered into jets. As we have seen, this can be done in a variety of ways, and in
general an analysis will involve multiple jet algorithms and "jet manipulations" such
as filtering or pruning. Being able to test and compare multiple jet tools at this step
is essential. Finally, having found jets, as well as other final state objects such as
isolated leptons, a specific physics analysis can be performed.

^1 Actually, without adding some kinematic restrictions, "how many particles are observed" is not
a well-defined quantity! The tt̄ Monte Carlo events used in Chapters 4 and 5 typically have ~ 500
outgoing particles from Pythia, ~ 250 particles with |η| < 5 and pT > 0.5 GeV, and ~ 150
calorimeter cells with pT > 1 GeV. This includes the effects of the underlying event, but not
pile-up.

In the following sections I review the individual steps in performing a physics 
study using Monte-Carlo-simulated data, noting at each step the various software 
packages available for that task. In discussing jet finding and analysis, I will pay par- 
ticular attention to the FastJet and SpartyJet packages; I have made significant 
contributions to the development of the latter. 

6.1 Analysis chain overview 

6.1.1 Event generation: ME/PS/Matching

A "complete" Monte Carlo event generator can be broken into three parts, typically 
performed by separate computer programs. First, a low-multiplicity "parton-level" 
event is generated, and given a weight corresponding to the exact matrix element 
squared for that process. Processes with hadrons in the initial or final state (e+e- →
jets or pp → ZH, for example) are treated as involving some fixed number of quarks
or gluons (e+e- → qq̄ or gg → ZH, for example). The matrix elements are calculated
to some fixed order in αs, often just to tree level; logarithmic resummation can also
be included at this step. To ensure that events have finite weights, kinematic cuts on 
the outgoing particles are typically required. 

To produce the multiplicity of particles seen in the detector, the matrix-element- 
level generator must be combined with a "parton shower Monte Carlo" , which takes 
outgoing partons (quarks and gluons) and iteratively radiates gluons and splits gluons 
into qq̄ pairs until all particles have energy (or some other scale such as virtuality)
below some fixed lower scale. This process typically assumes that emissions are inde-
pendent of each other, and approximates the matrix elements for gluon radiation and
splitting, making sure to be accurate in the limit that a splitting is soft or collinear
(the singular limits of the matrix element). This obviously does not reproduce the
QCD shower exactly, but is correct to leading logarithmic precision. 

Without some care, the matrix-element-level generation and the parton shower will 
not cover all of phase space exactly once. Consider an event sample that is represented 
at the parton level as e+e- → qq̄g. Each of the quarks will radiate gluons as part
of the parton shower; one of these gluons could end up with the same momentum
as the gluon produced by the matrix element generator unless we impose some sort
of restriction on one or both Monte Carlos. One solution is to require that partons in
a matrix-element-level event be well-separated by some criterion (kT distance, say),
and that parton shower emissions can never be separated by this much. Generically, 
a method for combining matrix-element generators with parton shower generators is 
called a "matching procedure" . 

One final step is necessary before we can send our events to a detector simulator. 
A parton shower produces a multiplicity of quarks and gluons, but of course these are 
not the particles we observe. QCD is confining, and the outgoing quarks and gluons, 
after showering down to some low energy of order Λ_QCD, will re-arrange into bound
states — hadrons. This is a fundamentally non-perturbative process, and the best 
we can do is model it and fit the model to data. Such a hadronization procedure is 
typically included at the end of a parton shower Monte Carlo. 

In hadron collisions, we must also consider the initial state. First, rather than 
generate events with incoming partons of fixed energy, we must include partons with 
arbitrary fractions of the incoming hadrons' momenta and convolve with the proba-
bility that, at the energy scale involved, we find two partons with those two momentum
fractions. These probabilities are known as "parton density functions". They cannot 
be calculated perturbatively, although their renormalization group flow can, so after 
measuring their form at some energy scale we can predict them at any other scale. 
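
Schematically, the convolution just described takes the standard factorized form below. The notation is introduced here for illustration only: f_i(x, μ) is the density of parton species i carrying momentum fraction x at scale μ, σ̂ is the partonic cross section computed from the matrix element, and s is the squared hadronic center-of-mass energy.

```latex
\begin{equation}
  \sigma(pp \to X) = \sum_{i,j} \int_0^1 dx_1 \int_0^1 dx_2 \,
    f_i(x_1, \mu)\, f_j(x_2, \mu)\,
    \hat{\sigma}_{ij \to X}(x_1 x_2 s, \mu) .
\end{equation}
```

The product f_i(x_1, μ) f_j(x_2, μ) plays the role of the probability of finding the two partons with the given momentum fractions, and the partonic process is evaluated at the reduced energy squared x_1 x_2 s.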
We must also, in analogy with the showering of outgoing partons, consider radiation 
from the incoming partons ("initial state radiation"). Finally, the "beam remnants"
(the valence and sea quarks from the incoming hadrons that did not participate in the
main interactions) can themselves interact. The output of these interactions is
known as the "underlying event". To the extent that these "multiple interactions"
are independent of the rest of the event, they will typically involve low (transverse)
energy scales, since in the absence of analysis cuts (i.e., "minimum bias") all events
typically involve low scales, so having two high-energy interactions in a single collision
is rare. However, note that if the final state is not a color singlet (qq̄ → g* → tt̄,
e.g.) the underlying event cannot be completely independent of the primary inter-
action due to color connections. In practice, it is observed that the underlying event is
independent to a good approximation [44]. All of these effects can either be incor-
porated into the parton shower Monte Carlo or generated independently. Note that
the outgoing quarks and gluons from initial state radiation and the underlying event 
must themselves shower and hadronize. 

A fairly complete database of Monte Carlo event generators is available at the 
CEDAR HepCode page [108].

Monte Carlo programs used in this work 

In the studies discussed in this thesis, we use the MadGraph/MadEvent package 
[109] to generate matrix-element-level events. For the pp studies, MLM matching is
used. Both MLM [110] and CKKW [111] matching are included in the MG/ME-
Pythia interface included with the MG/ME package. We use MG/ME's included
Pythia package (version 6.4 [112]) to shower incoming and outgoing partons, as well
as generate multiple interactions (the underlying event). Pythia also models the 
hadronization of partons. 

6.1.2 Detector simulation 

After generating particle-level events, sets of output particles should be passed to 
some kind of detector simulator. Very detailed simulators of the detectors for all major
particle physics experiments exist (see, e.g. [113]), but these are typically overkill for
speculative theoretical studies. For these, a general purpose simulator that captures
the broad features of calorimetry is sufficient: PGS [114] and Delphes [115] are two
examples. 

In the studies described in this thesis, we have used our own crude detector simu- 
lation, which rejects invisible and outside-of-detector particles, clusters particles into 
calorimeter cells, isolates leptons, and imposes a minimum pT cut on calorimeter cells.
We have also incorporated Gaussian smearing of calorimeter cell energies to roughly 
model detector resolution effects. 
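
The cell-clustering step of such a crude simulation can be sketched as follows. This is a simplified illustration, not the code used in this thesis: the names Particle, Cell, and make_cells are hypothetical, the 0.1 x 0.1 cell size in (η, φ) is an assumed value, each cell is treated as massless and pointing at its center, and the rejection of invisible particles and the isolation of leptons are omitted. Because 0.1 does not divide 2π evenly, the last φ bin is narrower than the others, which a real implementation would handle more carefully.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <utility>
#include <vector>

struct Particle { double E, eta, phi; };  // phi assumed in [0, 2*pi)
struct Cell { double E, eta, phi, pt; };  // massless: pt = E / cosh(eta)

// Sum particle energies on an (eta, phi) grid and return the cells above pt_min.
std::vector<Cell> make_cells(const std::vector<Particle>& particles,
                             double cell_size = 0.1, double eta_max = 5.0,
                             double pt_min = 1.0) {
    assert(cell_size > 0.0);
    std::map<std::pair<int, int>, double> energy;  // (eta bin, phi bin) -> summed E
    for (const Particle& p : particles) {
        if (std::fabs(p.eta) >= eta_max) continue;  // outside the detector
        int ieta = static_cast<int>(std::floor(p.eta / cell_size));
        int iphi = static_cast<int>(std::floor(p.phi / cell_size));
        energy[std::make_pair(ieta, iphi)] += p.E;
    }
    std::vector<Cell> cells;
    for (const auto& entry : energy) {
        double eta = (entry.first.first + 0.5) * cell_size;   // cell center
        double phi = (entry.first.second + 0.5) * cell_size;
        double pt = entry.second / std::cosh(eta);
        if (pt >= pt_min) cells.push_back(Cell{entry.second, eta, phi, pt});
    }
    return cells;
}
```

The resulting cells are the natural inputs to a jet algorithm, and their energies are where a Gaussian smearing would be applied.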

6.1.3 Jet finding and analysis 

To study events with jets, a jet algorithm must be applied to the outputs of the de- 
tector simulation, typically calorimeter cells. An enormous variety of such algorithms 
exist (see [45] for a survey), all of which have been implemented in software. Histor-
ically this was done individually by experimental groups and theorists, occasionally
in subtly different ways (see, e.g., the discussion of seeded cone algorithms in [116]).
Now the FastJet package [21] is fast becoming standard among jet practitioners.
FastJet implements most, if not all, commonly used jet algorithms, and through 
a plugin mechanism can be extended to implement other algorithms as well. Many 
FastJet algorithms incorporate insights from computational geometry, making them 
faster than previous implementations. More details on FastJet are given in Sec. 6.2. 

Another tool for studying jets has recently emerged: SpartyJet ([23], [24]).
SpartyJet incorporates jet finding with FastJet and adds several useful layers of
input, analysis, and output, partially based on ROOT [117]. Many analysis compo-
nents can be glued together with simple Python scripts. More details on SpartyJet
are given in Sec. 6.3. 

In the studies described in this thesis, we have used SpartyJet for jet finding 
and analysis; the jet algorithms were implemented in FastJet via the SpartyJet
wrapper. The plots new to this thesis were all generated from the SpartyJet GUI.

6.2 FastJet

FastJet is the new standard in jet finding. This section gives a brief overview of its 
capabilities. A more detailed description of FastJet's features, use, and implemen-
tation is given in the official FastJet manual [118]. At the end of this section I will
also discuss the FastJet plugin I have written to implement jet pruning in a simple 
and standard way. 

6.2.1 Overview 

The achievement of FastJet is two-fold: first, to standardize the implementation
of jet algorithms between and among experimentalists and theorists, eliminating the
possibility of subtle and hidden discrepancies; second, to bring together in one place
advances in jet finding technology, for example introducing the technique of Voronoi
diagrams (see the discussion and references in [21]) for efficient distance finding for
very large numbers of particles. FastJet also includes several implementations of
"jet area" finding [90] for arbitrary jet algorithms, which I will not discuss.

FastJet is a package of C++ libraries that implement jet finding and related
tools. The primary classes are:

    class fastjet::PseudoJet;
    class fastjet::JetDefinition;
    class fastjet::ClusterSequence;

PseudoJet is a basic four-vector class, adding a pair of indices: one for cluster or-
dering and one left to the user. JetDefinition collects the full specification of a
jet definition, including an algorithm like kT, R and any other parameters neces-
sary, and a recombination scheme.^2 The actual business of jet finding is done by
the ClusterSequence class. Given a list of PseudoJets and a JetDefinition, a
ClusterSequence constructs the set of final jets. For recombination algorithms, a
merging history is also constructed.^3 Both the jets and the clustering history can be
accessed with a variety of methods:

    #include <vector>
    #include "fastjet/ClusterSequence.hh"
    using namespace fastjet;
    using namespace std;

    // Set up input particles
    vector<PseudoJet> inputs;
    // ... fill this vector somehow

    // Set up a jet definition
    JetAlgorithm algorithm = kt_algorithm;
    double R = 1.0;
    RecombinationScheme recomb_scheme = E_scheme;
    Strategy strategy = Best;
    JetDefinition jet_def(algorithm, R, recomb_scheme, strategy);

    // Get jets and merging history
    ClusterSequence cluster_seq(inputs, jet_def);

    // ***** Access methods **************
    // (pt_min and dcut are doubles and Njets is an int, all chosen by the user)
    // inclusive jets
    vector<PseudoJet> inc_jets = cluster_seq.inclusive_jets(pt_min);
    // exclusive jets, with a dcut
    vector<PseudoJet> exc_dcut_jets = cluster_seq.exclusive_jets(dcut);
    // exclusive jets, stop at N jets
    vector<PseudoJet> exc_jets = cluster_seq.exclusive_jets(Njets);

    // get constituents of a given jet
    PseudoJet jet = inc_jets[0];
    vector<PseudoJet> consts = cluster_seq.constituents(jet);

    // look at substructure
    PseudoJet child, parent1, parent2;
    child = jet;
    while (cluster_seq.has_parents(child, parent1, parent2)) {
        child = parent1;
    }
    // child is now an input particle from jet
    // child = parent1(parent1( ... parent1(jet) ... ))

    // Can also go the other way:
    PseudoJet pj = child;
    PseudoJet new_child;
    if (cluster_seq.has_child(pj, new_child)) {
        // ...
    }
    // new_child is set to pj's child

^2 A recombination scheme specifies how to make PseudoJet p from merged PseudoJets p1 and
p2. To combine four momenta, by far the most common is the "E-scheme", where p = p1 + p2.
The indices on a PseudoJet allow expanded schemes where, for example, the user index tracks
the parton flavor which the recombination scheme can be designed to propagate.

^3 Actually, FastJet stores a merging history for all algorithms, including cone-type algorithms,
but for the latter the history is not meaningful.

Note that in FastJet language, two "parent" pseudojets merge into a "child" 
pseudojet, in contrast to the parent/daughter language used in Sections 4 and 5.

6.2.2 Built-in versus plugin algorithms 

The set of algorithms that run natively in FastJet is shown in Table 6.1. Note
that all of the native algorithms are specific cases of the generalized kT algorithm for
either hadron or e+e- collisions.
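
To make the "specific cases" statement concrete, the hadron-collider distance measures of Table 6.1 can be written as one function of an exponent p. This is a stand-alone sketch that does not use FastJet itself: the names Proto, delta_R2, genkt_dij, and genkt_diB are hypothetical, and the azimuthal difference is assumed to be at most 2π in magnitude.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Proto { double pt, rap, phi; };  // transverse momentum, rapidity, azimuth

// Squared distance in the (rapidity, azimuth) plane, with phi wrapped to [0, pi].
double delta_R2(const Proto& a, const Proto& b) {
    const double pi = 3.14159265358979323846;
    double dphi = std::fabs(a.phi - b.phi);
    if (dphi > pi) dphi = 2.0 * pi - dphi;
    double drap = a.rap - b.rap;
    return drap * drap + dphi * dphi;
}

// Generalized kT pairwise distance: d_ij = min(pTi^2p, pTj^2p) * DeltaR_ij^2 / R^2.
// p = 1 gives kT, p = 0 gives Cambridge/Aachen, and p = -1 gives anti-kT.
double genkt_dij(const Proto& a, const Proto& b, double p, double R) {
    assert(R > 0.0);
    double wa = std::pow(a.pt, 2.0 * p);
    double wb = std::pow(b.pt, 2.0 * p);
    return std::min(wa, wb) * delta_R2(a, b) / (R * R);
}

// Generalized kT beam distance: d_iB = pTi^2p.
double genkt_diB(const Proto& a, double p) {
    return std::pow(a.pt, 2.0 * p);
}
```

At each step a recombination algorithm finds the smallest of all d_ij and d_iB; the sign of p therefore decides whether soft or hard objects are merged first.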



Collider  Algorithm         Name                 d_ij                                              d_iB
pp        kT                kt_algorithm         min(pTi^2, pTj^2) ΔR_ij^2/R^2                     pTi^2
pp        Cambridge/Aachen  cambridge_algorithm  ΔR_ij^2/R^2                                       1
pp        anti-kT           antikt_algorithm     min(pTi^-2, pTj^-2) ΔR_ij^2/R^2                   pTi^-2
pp        Generalized kT    genkt_algorithm      min(pTi^2p, pTj^2p) ΔR_ij^2/R^2                   pTi^2p
e+e-      kT                ee_kt_algorithm      2 min(E_i^2, E_j^2) (1 - cos θ_ij)                (none)
e+e-      Generalized kT    ee_genkt_algorithm   min(E_i^2p, E_j^2p) (1 - cos θ_ij)/(1 - cos R)    E_i^2p

Table 6.1: Native FastJet algorithms

This set of algorithms is implemented internally in FastJet, but a much broader
(and growing) class of jet algorithms is accessible through the plugin mechanism. A
FastJet plugin is derived from the abstract base class fastjet::JetDefinition::Plugin.
A plugin defines the run_clustering(ClusterSequence &) function, using an inter-
nal interface to the passed ClusterSequence. Many algorithms beyond kT variants
are shipped with FastJet as plugins; here is an example of their use from the FastJet
manual [118]:

    // have some plugin class derived from the Plugin base class
    class CDFMidPointPlugin : public fastjet::JetDefinition::Plugin {...};

    // create an instance of the CDFMidPointPlugin class
    CDFMidPointPlugin cdf_midpoint( [... options ...] );

    // create the jet definition
    fastjet::JetDefinition jet_def = fastjet::JetDefinition(&cdf_midpoint);

    // then create ClusterSequence with the input particles and jet_def,
    // and use it to extract jets as usual

For a better idea of how a plugin is actually implemented, see the description of 
the FastPrune plugin in the next subsection. 

A list of plugins available in FastJet is given in Table 6.2. In addition, several
recent proposals for new jet finding techniques have been accompanied by the release
of FastJet plugins (e.g., [91], [20], and [2]). FastJet's capabilities continue to grow.
The nature of the plugin mechanism allows arbitrary new jet methods to be plugged 
directly into old analyses with minimal effort. 

6.2.3 The FastPrune plugin 

Having read Sec. 5, the reader is no doubt eager to try jet pruning at home. Rest 
assured, gentle reader: nothing could be easier. I have written FastPrune, a FastJet
plugin implementing pruning, for just this purpose. The plugin is available online
[106]. This subsection gives an overview of the plugin's features and use; all code is
taken from version 0.4.1.

Like any FastJet plugin, FastPrune is implemented as a class deriving from 
fastjet::JetDefinition::Plugin. The following constructors are available:

    // Basic constructor
    FastPrunePlugin(const JetDefinition & find_definition,
                    const JetDefinition & prune_definition,
                    const double & zcut = 0.1,
                    const double & Rcut_factor = 0.5);

    // Lets the user specify a Recombiner class
    FastPrunePlugin(const JetDefinition & find_definition,
                    const JetDefinition & prune_definition,
                    const JetDefinition::Recombiner* recomb,
                    const double & zcut = 0.1,
                    const double & Rcut_factor = 0.5);

    // Two new constructors that allow you to pass your own CutSetter.
    // This lets you define zcut and Rcut on a jet-by-jet basis.
    FastPrunePlugin(const JetDefinition & find_definition,
                    const JetDefinition & prune_definition,
                    CutSetter* const cut_setter);

    FastPrunePlugin(const JetDefinition & find_definition,
                    const JetDefinition & prune_definition,
                    CutSetter* const cut_setter,
                    const JetDefinition::Recombiner* recomb);

The parameters zcut and Rcut_factor correspond to the parameters z_cut and
D_cut in Sec. 5.1, where the actual D_cut used for a given jet is Rcut_factor × 2 m_J/pT_J.
Two jet definitions need to be passed. The first is used to find initial jets (Step 0
in Sec. 5.1). The second is used in the pruning procedure (Step 1), so should be a
recombination algorithm like CA or kT. The user can specify their own Recombiner,
for example to preserve flavor information in the merging. Setting the Recombiner
for the pruning jet definition will have the same effect. The user can also specify a
CutSetter class, which stores values for zcut and Rcut and implements the function
SetCuts(const PseudoJet &, const ClusterSequence &). CutSetter, as well as
an example DefaultCutSetter, are defined in FastPrunePlugin.hh.

FastPrune works in three stages. First, unpruned jets are found with the
JetDefinition find_definition. Second, each individual jet and its constituents
are passed to a second ClusterSequence using the prune_definition. The
Recombiner for the pruned JetDefinition is set to be a PrunedRecombiner, a helper
class that implements the pruning test. It wraps the Recombiner in prune_definition,
checking for the pruning test given in Eq. 5.1. If the test fails (i.e., the softer branch
should be pruned), the recombination does not happen and the index of the pruned
PseudoJet is stored. Finally, the ClusterSequence built up by this process is transferred
to the output via the standard plugin interface.^

The most important step is the running of the pruned JetDefinition, with its
PrunedRecombiner. A few notes are in order. Since the jet definition is supplied
by the user, any algorithm that FastJet knows about can be pruned. Moreover,
FastPrune doesn't need to implement any actual jet finding since this is outsourced
to existing FastJet code. Since the only difference between a pruned algorithm and
the unpruned sort is that some recombinations are vetoed, the same JetDefinition
can be used, just with a new Recombiner. If the user supplies their own Recombiner,
this is passed to the plugin's PrunedRecombiner. PrunedRecombiner first checks if a
recombination should be pruned, then if not does the recombination with the user's
Recombiner. If no Recombiner is passed, then FastJet's DefaultRecombiner is
used. Finally, FastPrune preserves the user indices for input PseudoJets, and
these can be used, for example, by the user's Recombiner class.
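The veto logic described above can be sketched in a few lines of plain C++. This is only an illustration, not the FastPrune source: the `Branch` struct and the functions `delta_r`, `d_cut`, and `should_prune` are hypothetical stand-ins (they are not FastJet or FastPrune names), and the test assumes that the pruning condition of Eq. 5.1 is "z below zcut and angular separation above D_cut", with D_cut set from Rcut_factor as in the constructor description.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical stand-in for the kinematics of one branch; NOT fastjet::PseudoJet.
struct Branch {
    double pt;   // transverse momentum
    double rap;  // rapidity
    double phi;  // azimuthal angle in radians, assumed to lie in [0, 2*pi)
};

// Angular distance Delta R between two branches in the (rapidity, phi) plane.
double delta_r(const Branch& a, const Branch& b) {
    const double pi = std::acos(-1.0);
    double dphi = std::fabs(a.phi - b.phi);
    if (dphi > pi) dphi = 2.0 * pi - dphi;  // wrap the difference into [0, pi]
    const double drap = a.rap - b.rap;
    return std::sqrt(drap * drap + dphi * dphi);
}

// D_cut for a jet of mass m_jet and transverse momentum pt_jet:
// D_cut = Rcut_factor * 2 * m_jet / pt_jet.
double d_cut(double rcut_factor, double m_jet, double pt_jet) {
    return rcut_factor * 2.0 * m_jet / pt_jet;
}

// Returns true if the merging of a and b should be vetoed (softer branch pruned).
// pt_parent is the transverse momentum of the would-be merged object.
bool should_prune(const Branch& a, const Branch& b, double pt_parent,
                  double zcut, double dcut) {
    const double z = std::min(a.pt, b.pt) / pt_parent;
    return z < zcut && delta_r(a, b) > dcut;
}
```

In a wrapper of the kind described in the text, a check like `should_prune` would run before delegating to the user's Recombiner; when it returns true the softer branch is dropped instead of merged.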

Here is a shortened version of the example program indicating how the plugin is 
used: 



^In the final ClusterSequence, pruned PseudoJets appear in the merging history as steps with
invalid children; they are never merged with other PseudoJets or the beam.

// setup
JetDefinition jet_def(cambridge_algorithm, 1.0, E_scheme, Best);
JetDefinition jet_def_bigR(cambridge_algorithm, 0.5*pi, E_scheme, Best);
FastPrunePlugin *PRplugin = new FastPrunePlugin(jet_def, jet_def_bigR, 0.1, 0.5);
JetDefinition pruned_def(PRplugin);

vector<PseudoJet> inputs;
/* ... fill inputs somehow ... */

// find jets
ClusterSequence pruned_seq(inputs, pruned_def);

// access jets
vector<PseudoJet> pruned_jets = pruned_seq.inclusive_jets(20.0);
/* ... do stuff with jets ... */

// can also see which subjets were pruned
// pruned_subjets[0] are subjets pruned from highest-pT jet, pruned_subjets[1] are
// next-highest, etc.
vector<vector<PseudoJet> > pruned_subjets = PRplugin->pruned_subjets();



6.3 SpartyJet 

SpartyJet is a jet analysis package that complements and extends jet finding with 
FastJet. SpartyJet provides a framework for jet finding and analysis that includes 
support for a variety of input and output formats and easy combination of many jet 
manipulation and measurement tools. FastJet is a tool for finding jets; SpartyJet 
is a tool for studying jets. This section gives an overview of SpartyJet's capabilities, 
and is intended to complement the manual, available at [24].

6.3.1 Input and output 

SpartyJet can take particle input from a variety of sources; the user only
needs to specify the location of an input file and its format. A full list of possible input
formats is given in Table 6.3. Configuring input is simple: just create an instance of
the appropriate input class, typically passing it a file name:

SpartyJet::InputMaker *input = new SpartyJet::StdHepInput("events.hep");

All input classes derive from SpartyJet::InputMaker; an object of this type is passed
to jet analysis. Several add-ons to input reading are available, including checking for
bad input (e.g., four-momenta with negative energy) and storing PDG ID codes.



Format        Class name               Description
ROOT NTuple   NtupleInputMaker         Reads 4-vectors from a TTree
ASCII text    StdTextInput             Reads lines of "E px py pz" text
StdHEP        StdHepInput              Reads StdHEP XDR files
CALCHEP       CalchepPartonTextInput   Reads CALCHEP files
HepMC         HepMCInput               Reads HepMC ASCII output

Table 6.3: Available SpartyJet input formats



SpartyJet output is stored in a TTree in a ROOT file. Four-momenta for all 
jets (for an arbitrary set of jet finders) are stored, as well as four-momenta for all 
input particles and indices to keep track of which input particles ended up in which 
jet. The complete merging history (as in FastJet's ClusterSequence) is stored
internally, but not written to the output file. Persistency for the clustering history 
is in development. As described below, an arbitrary set of jet "moments" can be 
added to any or all jet finders; the values of these moments are also stored as TTree 
branches. 

6.3.2 Jet algorithms 

Previous versions of SpartyJet offered a large number of native jet algorithms, as 
well as access to a subset of native FastJet algorithms. Most of the native algorithms
are collaboration-specific implementations of cone and $k_T$-type algorithms,
for example CDF's JetClu. As experiments move to standardized algorithms, and
non-standard algorithms are implemented in FastJet, built-in SpartyJet jet algorithms
have become deprecated. Currently, the only SpartyJet native algorithm
not available through FastJet is an implementation of Pythia's CellJet. With
version 3.4, SpartyJet can now use any FastJet JetDefinition, including native
algorithms like $k_T$, included plugins like SISCone, or user-supplied plugins like
FastPrune. Any jet algorithm that can be implemented as a FastJet plugin can
be used with SpartyJet, and this is now the preferred method of adding a new jet
algorithm to SpartyJet.

Here are some examples of creating jet finders in SpartyJet. Jet finder classes 
derive from the more general JetTool class, about which more will be said in the 
next subsection. 

// *** Old-style jet finders (see examples_C/multiAlgExample.cc) ***
// Add a Midpoint alg
cdf::MidPointFinder *tool1 = new cdf::MidPointFinder();
tool1->set_coneRadius(.4); // can set all parameters like this
tool1->set_name("MidPoint4");
builder.add_default_alg(tool1);

// Add a JetClu alg
builder.add_default_alg(new cdf::JetClustFinder("myJetClu"));

// Add a CellJet alg; second parameter turns off constituent storage,
// which does not work in CellJet
builder.add_default_alg(new pythia::CellJetFinder("myCellJet"), false);

// *** New-style (FastJet) finders (see examples_C/FJExample.cc) ***
// Add an algorithm (AntiKt) - uses the fastjet::JetDefinition::JetAlgorithm enum
FastJetFinder *anti4 = new FastJetFinder("AntiKt4", antikt_algorithm, .4, false);
builder.add_default_alg(anti4);

// Same algorithm, uses your own JetDefinition
JetDefinition jet_def(antikt_algorithm, 0.4);
FastJetFinder *anti4_2 = new FastJetFinder(&jet_def, "AntiKt4_2", false);
builder.add_default_alg(anti4_2);

// More interesting example: FastJet Plugin
// Note that SISCone is included in FastJet, but is implemented as a plugin
// To use your own plugin, you will need to link against the relevant library
double coneRadius = 0.4, overlapThreshold = 0.75;
SISConePlugin plugin(coneRadius, overlapThreshold);
JetDefinition plugin_jet_def(&plugin);
FastJetFinder *siscone4 = new FastJetFinder(&plugin_jet_def, "SISCone4", false);
builder.add_default_alg(siscone4);



6.3.3 JetCollections and JetTools; Constructing an analysis 

The basic object of a SpartyJet analysis is a JetCollection; the basic action 
of an analysis is described by a sequence of JetTools. A JetCollection is just 
a set of Jets together with a map of jet and event "moments", which can
represent any measurement on a jet or an event — these are discussed further below. 
A JetCollection also stores the clustering history of the event it represents. 

A JetTool is an abstract base class that operates on a JetCollection: a JetTool
must define the method JetTool::execute(JetCollection &). The most important
JetTools are jet finders like those seen in the previous subsection. A jet finder
takes a JetCollection representing a set of input particles and replaces it with a
JetCollection containing a set of found jets together with their clustering history.
Other examples include JetPtSelectorTool, which removes all jets failing a $p_T$ cut,
JetMomentTool, an abstract class for tools that calculate and store jet moments for
the input JetCollection, and MinBiasInserterTool, which adds particles representing
pile-up events to an input JetCollection.
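The tool pattern described above can be sketched as follows. This is a simplified illustration under stated assumptions, not SpartyJet's actual code: `Jet`, `JetCollection`, `JetTool`, `PtSelectorTool`, and `run_chain` are minimal stand-ins defined here only for the example (the real classes also carry moments and clustering history), and only the `execute(JetCollection &)` signature is taken from the text.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stand-in types for illustration only; the real SpartyJet classes are richer.
struct Jet { double pt; };
struct JetCollection { std::vector<Jet> jets; };

// Abstract tool: every concrete tool modifies a JetCollection in place.
class JetTool {
public:
    virtual ~JetTool() {}
    virtual void execute(JetCollection& jets) = 0;
};

// Example tool (hypothetical name): removes all jets below a pT threshold.
class PtSelectorTool : public JetTool {
public:
    explicit PtSelectorTool(double min_pt) : min_pt_(min_pt) {}
    void execute(JetCollection& c) override {
        const double cut = min_pt_;
        c.jets.erase(std::remove_if(c.jets.begin(), c.jets.end(),
                                    [cut](const Jet& j) { return j.pt < cut; }),
                     c.jets.end());
    }
private:
    double min_pt_;
};

// A chain of tools applied in order, in the spirit of a JetAlgorithm.
void run_chain(const std::vector<JetTool*>& tools, JetCollection& c) {
    for (std::size_t i = 0; i < tools.size(); ++i) tools[i]->execute(c);
}
```

Because every tool shares the same in-place interface, a jet finder, a selector, and a moment calculator can be chained without any of them knowing about the others.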

A JetAlgorithm in SpartyJet is a sequence of JetTools; a complete analysis 
consists of a set of JetAlgorithms. The key ability of SpartyJet is to provide a 
very simple way to construct and compare multiple analyses, represented as chains 
of JetTools. An interesting example of a complete SpartyJet analysis is given in 
Appendix D, where I compare pruning to top-tagging and mass-drop filtering.

6.3.4 SpartyJet/FastJet interoperability

Recent developments in SpartyJet, in addition to streamlining the use of FastJet 
jet finders, have added the ability to convert back and forth between the main analysis 
objects in each framework: fastjet::ClusterSequence and SpartyJet::JetCollection,
including transfer of clustering history. In practical terms, this means that with minimal
wrapping, SpartyJet JetTools can be used in a FastJet-based analysis and
likewise FastJet-based tools can easily be inserted into SpartyJet-based analyses.

Wrapping of FastJet tools is done via the FastJetTool class, which converts
a JetCollection to a ClusterSequence, calls execute(ClusterSequence &), and
finally converts the modified ClusterSequence back to a JetCollection. Derived 
tools then implement some function on a ClusterSequence. 

Tools that use features already implemented in FastJet, e.g. the FastPrune 
tool described in the next section, are naturally written as FastJetTools. Other
tools, such as the TopDownPruneTool, which prunes away asymmetric branchings 
(used in several SpartyJet implementations of jet substructure tools), are simpler 
to implement in terms of JetCollections, which are easier to modify in place than 
ClusterSequences. 

FastPruneTool: an example FastJet-based tool

FastPruneTool is a variant of the FastPrune plugin, now included in the FastPrune
package, that is intended to be inserted into a SpartyJet analysis. Instead
of acting as a FastJet plugin, FastPruneTool inherits from SpartyJet::JetTool.
Given a JetCollection representing jets found with some jet finder, it returns a 
JetCollection representing the pruned versions of those jets. This simplifies the 
insertion of pruning into an existing analysis. If the analysis compares pruned jets 
to unpruned jets, the pruning tool eliminates the computational effort of finding jets 
twice (relative to using the FastJet plugin, which finds unpruned jets itself).

6.3.5 Jet moments

In addition to storing a set of jets (and their substructure) at each point in an analysis 
chain, SpartyJet stores jet "moments" — arbitrary pieces of additional information 
about each jet. Examples include a PDG ID code, stored as a jet moment for an input
"jet", or a jet area, which is calculated by a FastJet jet finder and then stored as
a jet moment. Moments are implemented via the Moment and JetMomentMap classes. 
Every JetCollection holds a JetMomentMap, which stores a set of moments for each 
jet in the collection. Moments can be saved and retrieved by name, and there can 
be any number of jet moments. Event moments, which do not correspond to any 
particular jet, can be created, stored, and retrieved in a similar manner. Every jet or 
event moment is stored as a branch in the output TTree. 

Jet and event moments are implemented via the JetMoment<T> and EventMoment<T> 
classes, which both inherit from Moment. T can be any basic type or class that ROOT 
has a dictionary for (so it can be stored in the output file). The JetMomentTool tool 
stores a user-supplied JetMoment<T>-derived object that calculates the given moment
for each jet in a JetCollection; the tool then stores this in the JetMomentMap for
that collection. See JetTools/JetMomentTool.hh for examples. Once a moment has
been stored, it can be accessed by subsequent tools, e.g. JetMomentSelectorTool,
which selects jets based on whether a given moment falls within a given range. See
examples_py/TopTaggerExample.py for an example.

6.3.6 Substructure tools 

A number of jet substructure tools have recently been introduced to SpartyJet. 
These include tools for jet filtering, "top-down pruning" as in the mass-drop step of 
[15] or the subjet-finding step in top-tagging [18], and subjet manipulation. Some of
these tools simply wrap existing FastJet tools (the wrapper is necessary so that the
tool behaves like a JetTool, modifying a JetCollection in place); others are natively
implemented in SpartyJet. See the substructure section of the SpartyJet user
manual, and the scripts in examples_py/ for more examples and details.

6.3.7 Graphical interface 

SpartyJet contains an (in development) graphical user interface (GUI) for compar- 
ing results for found jets. The developers of SpartyJet hope that in the near future 
this will become a powerful and easy to use tool for visually comparing the results 
of different analyses. An example screenshot is shown in Fig. 6.1. The GUI loads 
a specified output ROOT file and the user can display a variety of plots for one or 
more of the saved JetCollections. For example, a user could quickly plot the jet 
area and a jet shape variable, both calculated and stored as jet moments, for two 
different JetAlgorithms. Both event displays and full-run plots are available, and 
more types of display are planned. 




Figure 6.1: A screenshot of the SpartyJet GUI.

Chapter 7
CONCLUSIONS 

The last few years have seen a proliferation of new theoretical and experimental
techniques to search for new physics at the Large Hadron Collider. No silver bullets 
have been discovered, and none will be. Many complementary advances will no doubt 
contribute to the most significant results at the LHC. 

SCET 

Physics at the LHC inescapably involves jets. The best possible theoretical description 
of jet physics is therefore indispensable. Soft-collinear effective theory is proving to
be a powerful tool in this regard. As shown in Chapter 3, SCET provides a simple
framework for factorization, and hence resummation of the logarithms arising in each
separate piece of a calculation. SCET captures the dominant physics of QCD while
allowing for systematic improvements to the approximations used.

For SCET calculations to be useful at the LHC, however, several advances have 
been necessary. First, the effects of strongly-interacting particles in the initial state 
must be taken into account through "beam functions" [58, 119], essentially the
application of a jet function to the "beam jet" (see, e.g., [120]). Second, a useful
calculation at the LHC must be in terms of jets. Whereas event shapes were interesting
and useful measures of hadronic activity in $e^+e^-$ collisions, the environment of a
high-luminosity hadron collider is less well suited to event measures and it is useful
to think instead in terms of "jet shapes". Our work on jet angularity measurements
(Chapter 3, [1, 25]) is a step in this direction although it does not yet incorporate the
additional complications of a hadron collider. Other groups have also made progress
in incorporating jet algorithms (and jets) into SCET calculations [61, 121]. An
intriguing alternative involving an event-shape-like measure instead of traditional jets
was presented in [122].

What remains is to apply these improved theoretical predictions to specific applications.
One goal claimed in [25] is the use of angularities in distinguishing quark
and gluon jets. An obvious extension would be to use jet shapes to distinguish jets
involved in new physics (top jets, for example) from their QCD backgrounds, as in
the template overlap method of [100]. As theoretical predictions converge with experimental
methods, another challenge is incorporating the effects of jet modifications
such as filtering-type techniques and pile-up subtraction. One step in this direction
has been the calculation of non-global logarithms in filtered jets in [123].

Pruning 

While one approach to better LHC studies is better QCD predictions, another is to 
simply discard the parts of the event that are hardest to understand. This is the 
essential goal of grooming methods such as jet pruning. Of course this can only 
be done on average, but to some extent this approach allows us to focus on the 
high-energy, perturbative physics we understand well and pull out the signals we 
are interested in. As we saw in Chapters 4 and 5, pruning significantly reduced the 
new-physics-obscuring effects of splash-in from many sources. Pruning also greatly 
reduced the mass of pure QCD jets — typically moving background jets out of the 
signal region. We explored in Chapter 4 the reasons for these improvements. The 
branchings removed by pruning almost never represent the substructure of a heavy 
particle decay but are instead characteristic of QCD radiation or splash-in. Removing 
such branchings tends to clean up the signal and prune back the background. 

As methods for modifying jet substructure have proliferated, it has become clear 
that while they all exploit the same underlying physics, there can be subtle differences 
between methods that will moreover vary between analyses (see, e.g., the comparison
of top-tagging methods in [124]). The field awaits a synthesis of such techniques that
explains these differences. A full theory of jet substructure and filtering methods will 
require integrating our understanding of the QCD parton shower with the effects of 
initial state radiation, the underlying event, and pile-up. 

Software tools 

In the meantime, the experimentalist or phenomenologist is confronted with a surfeit
of choice in designing a new physics search. Fortunately tools exist for penetrating 
this thicket — pruning it back, as it were. In addition to the variety of jet algo- 
rithms available within the FastJet package, there is a growing number of jet tools 
implemented in software. In the author's estimation the simplest use of these tools 
exists within the SpartyJet package, which provides a framework for assembling a
jet analysis from a large — and rapidly increasing — number of jet filtering, mea- 
suring, and selecting tools. The goal of the SpartyJet package, thus far partially 
attained, is to simplify to the greatest extent possible the design and comparison of 
jet analyses. Improvements planned for the near future include greater inclusion of 
proposed jet tools and a more powerful, easier to use graphical interface for studying 
and comparing the final results. 

The Large Hadron Collider, run by some of the most highly funded and technologically 
sophisticated experimental collaborations in the history of science, will nonetheless 
require the advances in prediction, technique, and software that will be provided by 
the theory community. It is humbly hoped that the tools described in this thesis are 
a step in the right direction.

Appendix A

$e^+e^- \to$ HADRONS: AN EXAMPLE QCD CALCULATION IN DEPTH

A.1 Introduction

In this appendix I explain in detail how to calculate $\sigma(e^+e^- \to \text{hadrons})$ to $O(\alpha_s)$,
which is a nice example of a one-loop calculation, requiring non-trivial regularization.
Throughout I will take all masses to be zero, and use dimensional regularization
to regulate infrared divergences. The NLO diagrams have soft-collinear divergences
which show up as $1/\epsilon$ poles, which cancel in the final inclusive cross section.

Some notes on the calculation. I will use Peskin and Schroeder [4] conventions
throughout, notably a $(+\,-\,-\,-)$ metric. I take $s \ll m_Z^2$, so I can neglect contributions
from the $Z$ propagator. These are irrelevant to the consideration of the NLO strong
correction. The Feynman diagrams at $O(\alpha_s)$ are given in Fig. A.1.

In the following, I take the initial momenta to be $p_1$ and $p_2$, the photon momentum
to be $q$ (with $q^2 = s$), and the final quark momenta to be $k_1$ and $k_2$. For the
real emission diagram, I label the gluon momentum $k_3$. For this diagram, the non-trivial
phase space dependence makes it useful to define the scalars $x_i = 2k_i \cdot q/s$.
Note that $x_1 + x_2 + x_3 = 2$. After summing over spins and gluon polarization, the
cross section only depends on $x_1$ and $x_2$. I work in the center of momentum frame
($(p_1 + p_2)^\mu = q^\mu = (\sqrt{s}, 0, 0, 0)$) throughout.
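The constraint on the energy fractions follows in one line from momentum conservation for the real-emission final state. The short check below uses only the definitions given in this introduction (photon momentum $q$ with $q^2 = s$, final-state momenta $k_1$, $k_2$, $k_3$), plus the center-of-momentum frame for the last relation:

```latex
\[
  x_1 + x_2 + x_3
  = \frac{2\,(k_1 + k_2 + k_3)\cdot q}{s}
  = \frac{2\,q^2}{s}
  = 2 ,
  \qquad
  x_i = \frac{2\,k_i \cdot q}{s} = \frac{2E_i}{\sqrt{s}}
  \quad \text{(center-of-momentum frame)} .
\]
```

So each $x_i$ is the fraction of the beam energy $\sqrt{s}/2$ carried by parton $i$, and only two of the three fractions are independent.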
A.2 Some general results

A.2.1 Factorization of cross section

For all contributing diagrams, the amplitude is composed of a leptonic part ($e^+e^- \to \gamma^*$)
and a hadronic part ($\gamma^* \to q\bar{q}(g)$), with the general form

\[ \mathcal{M} = \frac{1}{s} L^\mu H_\mu. \tag{A.1} \]

At tree level, only one diagram contributes, so

\[ |\mathcal{M}|^2 = \frac{1}{s^2} L^{\mu *} L^{\nu} H^*_\mu H_\nu. \tag{A.2} \]

It is convenient to split the $1/s^2$ factor between the leptonic and hadronic parts to
make them dimensionless (for $2 \to 2$):

\[ L^{\mu\nu} = \frac{1}{s} L^{\mu *} L^{\nu}, \qquad H^{\mu\nu} = \frac{1}{s} H^{\mu *} H^{\nu}. \tag{A.3} \]

In this notation,

\[ |\mathcal{M}|^2 = L^{\mu\nu} H_{\mu\nu}. \tag{A.4} \]

$|\mathcal{M}|^2$ can in general be written in this form, up to electroweak corrections that connect
the incoming and outgoing particles. The Ward identity (or gauge invariance, or
current conservation, etc.) guarantees that

\[ q_\mu L^{\mu\nu} = q_\nu L^{\mu\nu} = q_\mu H^{\mu\nu} = q_\nu H^{\mu\nu} = 0. \tag{A.5} \]

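We're interested in calculating the total cross section, so we will, in the end, integrate
over the final phase space. This means that after this integration, there are no vectors
$H^{\mu\nu}$ can depend on other than $q^\mu$, so we can write:

\[ \int d\Pi_n\, H^{\mu\nu} = \int d\Pi_n \left( g^{\mu\nu} - \frac{q^\mu q^\nu}{q^2} \right) H'. \tag{A.6} \]

This means that we can re-express the phase space integral of $|\mathcal{M}|^2$:

\[ \begin{aligned}
\int d\Pi_n\, |\mathcal{M}|^2 &= L_{\mu\nu} \int d\Pi_n\, H^{\mu\nu} \\
&= L_{\mu\nu} \int d\Pi_n \left( g^{\mu\nu} - \frac{q^\mu q^\nu}{q^2} \right) H' \\
&= g^{\mu\nu} L_{\mu\nu} \int d\Pi_n\, H'.
\end{aligned} \tag{A.7} \]

The Ward identity has been used to discard the $q^\mu q^\nu$ term going from the second
to the third line. $n = 2$ for the tree-level diagram and the virtual correction; $n = 3$
for the real emission. Noting that in $d$ dimensions, $g^{\mu\nu} g_{\mu\nu} = d$,

\[ \begin{aligned}
g_{\mu\nu} H^{\mu\nu} &= g_{\mu\nu} \left( g^{\mu\nu} - \frac{q^\mu q^\nu}{q^2} \right) H' \\
&= (d-1) H'.
\end{aligned} \tag{A.8} \]

Defining $L = g_{\mu\nu} L^{\mu\nu}$ and $H = g_{\mu\nu} H^{\mu\nu}$, we can write

\[ \int d\Pi_n\, |\mathcal{M}|^2 = \frac{1}{d-1}\, L \int d\Pi_n\, H. \tag{A.9} \]

Generically, a $2 \to N$ cross section has the form

\[ \sigma(2 \to N) = \frac{1}{2s} \int d\Pi_N\, |\mathcal{M}|^2, \tag{A.10} \]

so for the contributions we consider, we can write

\[ \sigma_{2(3)} = \frac{1}{2s(d-1)}\, L \int d\Pi_{2(3)}\, H. \tag{A.11} \]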

A.2.2 Phase space in d dimensions

Since we are regularizing the calculation by performing it in d dimensions, we must 
work out the phase space factors in arbitrary dimension. 

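Two-body

The two-body final states have trivial dependence on the phase space variables, so we
just need to calculate the total integral:

\[ \begin{aligned}
\int d\Pi_2 &= \mu^{4-d} \int \frac{d^{d-1}k_1}{(2\pi)^{d-1}\, 2E_1} \frac{d^{d-1}k_2}{(2\pi)^{d-1}\, 2E_2}\, (2\pi)^d\, \delta^{(d)}(q - k_1 - k_2) \\
&= \frac{\mu^{4-d}}{4} \int \frac{E_1^{d-4}\, dE_1\, d^{d-2}\Omega}{(2\pi)^{d-2}}\, \delta(\sqrt{s} - 2E_1) \\
&= \frac{\mu^{4-d}}{8} \left( \frac{\sqrt{s}}{2} \right)^{d-4} \int \frac{d^{d-2}\Omega}{(2\pi)^{d-2}} \\
&= \frac{\mu^{4-d}}{8} \left( \frac{\sqrt{s}}{2} \right)^{d-4} \frac{2\pi^{(d-1)/2}}{(2\pi)^{d-2}\, \Gamma((d-1)/2)}.
\end{aligned} \tag{A.12} \]

The $\mu^{4-d}$ factor is inserted to keep the overall dimension correct. The last line
uses a standard result for the surface area of an n-sphere.^ Writing $d = 4 - 2\epsilon$,

\[ \begin{aligned}
\int d\Pi_2 &= \frac{1}{8\pi} \left( \frac{16\pi\mu^2}{s} \right)^{\epsilon} \frac{\sqrt{\pi}}{2\,\Gamma(\frac{3}{2} - \epsilon)} \\
&= \frac{1}{8\pi} \left( \frac{4\pi\mu^2}{s} \right)^{\epsilon} \frac{\Gamma(1-\epsilon)}{\Gamma(2-2\epsilon)}.
\end{aligned} \tag{A.13} \]

In the second line, we used the relation $\Gamma(z)\Gamma(z + \frac{1}{2}) = 2^{1-2z} \sqrt{\pi}\, \Gamma(2z)$.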
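The duplication relation used here is easy to check numerically. The sketch below is not part of the original calculation; it assumes the relation in the form $\Gamma(z)\Gamma(z+\frac{1}{2}) = 2^{1-2z}\sqrt{\pi}\,\Gamma(2z)$ and the two lines of Eq. A.13 as $\frac{1}{8\pi}(16\pi\mu^2/s)^{\epsilon}\sqrt{\pi}/(2\Gamma(\frac{3}{2}-\epsilon))$ and $\frac{1}{8\pi}(4\pi\mu^2/s)^{\epsilon}\Gamma(1-\epsilon)/\Gamma(2-2\epsilon)$. All function names are chosen for this example only.

```cpp
#include <cmath>

// Left-hand side of the duplication relation: Gamma(z) * Gamma(z + 1/2).
double duplication_lhs(double z) {
    return std::tgamma(z) * std::tgamma(z + 0.5);
}

// Right-hand side: 2^(1 - 2z) * sqrt(pi) * Gamma(2z).
double duplication_rhs(double z) {
    const double pi = std::acos(-1.0);
    return std::pow(2.0, 1.0 - 2.0 * z) * std::sqrt(pi) * std::tgamma(2.0 * z);
}

// Two-body phase space, first form (assumed first line of Eq. A.13).
// mu2_over_s is the dimensionless ratio mu^2 / s.
double two_body_ps_line1(double eps, double mu2_over_s) {
    const double pi = std::acos(-1.0);
    return std::pow(16.0 * pi * mu2_over_s, eps) * std::sqrt(pi)
           / (2.0 * std::tgamma(1.5 - eps)) / (8.0 * pi);
}

// Two-body phase space, second form (assumed second line of Eq. A.13).
double two_body_ps_line2(double eps, double mu2_over_s) {
    const double pi = std::acos(-1.0);
    return std::pow(4.0 * pi * mu2_over_s, eps) * std::tgamma(1.0 - eps)
           / std::tgamma(2.0 - 2.0 * eps) / (8.0 * pi);
}
```

At $\epsilon = 0$ both forms reduce to the familiar four-dimensional value $1/(8\pi)$, and for small non-zero $\epsilon$ they should agree to floating-point precision.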
Three-body

In this case the final state contains three vectors $H'$ can depend on, but one can
show (it's pretty easy) that all possible scalar products between them can be expressed
as a function of $s$ and the energy fractions $x_1$ and $x_2$. We need to integrate out the
other $3(d-1)-2$ variables and save the $x_1$ and $x_2$ integrals until we know the integrand.
The key trick is using the energy-conserving delta function to integrate over an angle,
not an energy. We start with the trivial integral over the three-momentum delta
function:

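^The Wikipedia page for "Spherical coordinates" has a number of nice results relating to this.

\[ \int d\Pi_3 = \mu^{2(4-d)} \int \frac{d^{d-1}k_1}{(2\pi)^{d-1}\, 2E_1} \frac{d^{d-1}k_2}{(2\pi)^{d-1}\, 2E_2} \frac{2\pi}{2E_3}\, \delta\!\left( \sqrt{s} - E_1 - E_2 - |\vec{k}_1 + \vec{k}_2| \right). \tag{A.14} \]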

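We can split the remaining two integrals into energy (magnitude) and angular
parts. One angular integral is trivial, but the other will include integrating over the
delta function, which we write:

\[ \begin{aligned}
\delta\Big( \sqrt{s} - \sum_i E_i \Big) &= \delta\!\left( \sqrt{s} - E_1 - E_2 - |\vec{k}_1 + \vec{k}_2| \right) \\
&= \delta\!\left( \sqrt{s} - E_1 - E_2 - \sqrt{E_1^2 + E_2^2 + 2E_1E_2\cos\theta} \right) \\
&= \frac{E_3}{E_1E_2}\, \delta(\cos\theta - \cos\theta_0),
\end{aligned} \tag{A.15} \]

where $\cos\theta_0$ is defined by

\[ \begin{aligned}
\cos\theta_0 &= \frac{(\sqrt{s} - E_1 - E_2)^2 - E_1^2 - E_2^2}{2E_1E_2} \\
&= \frac{2(1-x_1)(1-x_2)}{x_1x_2} - 1.
\end{aligned} \tag{A.16} \]

Note that $E_3$ is fixed by $E_1$ and $E_2$. Integrating over the delta function gives a
theta function that limits us to the physical region in the energy integrals. Returning
to the phase space integral:

\[ \begin{aligned}
\int d\Pi_3 &= \frac{\mu^{2(4-d)}}{(2\pi)^{2d-3}} \int \frac{E_1^{d-2}\, dE_1\, d^{d-2}\Omega_1\, E_2^{d-2}\, dE_2\, d^{d-2}\Omega_2}{8E_1E_2E_3}\, \frac{E_3}{E_1E_2}\, \delta(\cos\theta - \cos\theta_0) \\
&= \frac{\mu^{2(4-d)}}{8(2\pi)^{2d-3}} \int dE_1\, dE_2\, (E_1E_2)^{d-4}\, \frac{2\pi^{(d-1)/2}}{\Gamma((d-1)/2)} \left[ \int d^{d-2}\Omega\, \delta(\cos\theta - \cos\theta_0) \right] \\
&= \frac{\mu^{2(4-d)}}{8(2\pi)^{2d-3}} \left( \frac{s}{4} \right)^{d-3} \frac{2\pi^{(d-1)/2}}{\Gamma((d-1)/2)} \int \frac{dx_1\, dx_2}{(x_1x_2)^{4-d}} \left[ \int d^{d-2}\Omega\, \delta(\cos\theta - \cos\theta_0) \right].
\end{aligned} \tag{A.17} \]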



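For the last angular integral, we need to break a $(d-2)$-dimensional angular space
into one azimuthal angle and the rest:

\[ \int d^{d-2}\Omega = \int d^{d-3}\Omega \int d\theta\, \sin^{d-3}\theta, \tag{A.18} \]

so:

\[ \begin{aligned}
\int d^{d-2}\Omega\, \delta(\cos\theta - \cos\theta_0) &= \left( \int d^{d-3}\Omega \right) \int d\theta\, \sin^{d-3}\theta\, \delta(\cos\theta - \cos\theta_0) \\
&= \frac{2\pi^{(d-2)/2}}{\Gamma((d-2)/2)} \int d\cos\theta\, \sin^{d-4}\theta\, \delta(\cos\theta - \cos\theta_0) \\
&= \frac{2\pi^{(d-2)/2}}{\Gamma((d-2)/2)} \left( 1 - \cos^2\theta_0 \right)^{(d-4)/2} \\
&= \frac{2\pi^{(d-2)/2}}{\Gamma((d-2)/2)} \left( \frac{4(1-x_1)(1-x_2)(1-x_3)}{x_1^2 x_2^2} \right)^{(d-4)/2}.
\end{aligned} \tag{A.19} \]

Putting it all together,

\[ \begin{aligned}
\int d\Pi_3 &= \frac{\mu^{2(4-d)}}{8(2\pi)^{2d-3}} \left( \frac{s}{4} \right)^{d-3} \frac{2\pi^{(d-1)/2}\; 2\pi^{(d-2)/2}}{\Gamma((d-1)/2)\, \Gamma((d-2)/2)} \\
&\quad \times \int \frac{dx_1\, dx_2}{(x_1x_2)^{4-d}} \left( \frac{4(1-x_1)(1-x_2)(1-x_3)}{x_1^2 x_2^2} \right)^{(d-4)/2} \\
&= \frac{\mu^{2(4-d)}\, s^{d-3}}{2^{2d-5/2}\, (2\pi)^{d-3/2}\, \Gamma((d-1)/2)\, \Gamma((d-2)/2)} \\
&\quad \times \int dx_1\, dx_2\, \left[ (1-x_1)(1-x_2)(1-x_3) \right]^{(d-4)/2}.
\end{aligned} \tag{A.20} \]

Again writing $d = 4 - 2\epsilon$,

\[ \begin{aligned}
\int d\Pi_3 &= \frac{\mu^{4\epsilon}\, s^{1-2\epsilon}}{2^{11/2-4\epsilon}\, (2\pi)^{5/2-2\epsilon}\, \Gamma(\frac{3}{2}-\epsilon)\, \Gamma(1-\epsilon)} \int dx_1\, dx_2\, \left[ (1-x_1)(1-x_2)(1-x_3) \right]^{-\epsilon} \\
&= \frac{\mu^{4\epsilon}\, s^{1-2\epsilon}}{2^{7-4\epsilon}\, \pi^{3-2\epsilon}\, \Gamma(2-2\epsilon)} \int dx_1\, dx_2\, \left[ (1-x_1)(1-x_2)(1-x_3) \right]^{-\epsilon}.
\end{aligned} \tag{A.21} \]

In the last line we have used the relation $\Gamma(z)\Gamma(z + 1/2) = 2^{1/2-2z} \sqrt{2\pi}\, \Gamma(2z)$. Our
final result:

\[ \int d\Pi_3 = \frac{s}{128\pi^3} \left( \frac{4\pi\mu^2}{s} \right)^{2\epsilon} \frac{1}{\Gamma(2-2\epsilon)} \int dx_1\, dx_2\, \left[ (1-x_1)(1-x_2)(1-x_3) \right]^{-\epsilon}. \tag{A.22} \]

Whew! Now we can actually start calculating diagrams.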

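A.3 Tree-level cross section

We first calculate the tree-level cross section $\sigma(e^+e^- \to q\bar{q})$. The calculation is identical
to $e^+e^- \to \mu^+\mu^-$, up to an overall color factor. There are no divergences to worry
about, so we will go ahead and set $d = 4$ for this part of the calculation. The matrix
element is:

\[ i\mathcal{M} = \left[ \bar{v}(p_2)(-ie\gamma^\mu)u(p_1) \right] \frac{-ig_{\mu\nu}}{s} \left[ \bar{u}(k_1)(-ieQ_f\gamma^\nu)v(k_2) \right]. \tag{A.23} \]

In the notation of Eqs. A.3 and A.4, we can write $|\mathcal{M}|^2 = L_{\mu\nu}H^{\mu\nu}$, with:

\[ L^{\mu\nu} = \frac{e^2}{s} \left[ \bar{v}(p_2)\gamma^\mu u(p_1) \right] \left[ \bar{u}(p_1)\gamma^\nu v(p_2) \right]; \tag{A.24} \]

\[ H^{\mu\nu} = \frac{e^2Q_f^2}{s} \left[ \bar{u}(k_1)\gamma^\mu v(k_2) \right] \left[ \bar{v}(k_2)\gamma^\nu u(k_1) \right]. \tag{A.25} \]

As shown in Eq. A.9, we only need $g_{\mu\nu}L^{\mu\nu}$ and $g_{\mu\nu}H^{\mu\nu}$. We now calculate these,
summing over final spins and averaging over initial spins.

\[ \begin{aligned}
L \equiv g_{\mu\nu}L^{\mu\nu} &= \frac{e^2}{4s}\, \mathrm{Tr}\left[ \not{p}_2 \gamma^\mu \not{p}_1 \gamma_\mu \right] \\
&= -\frac{2e^2}{s}\, p_1 \cdot p_2 \\
&= -e^2,
\end{aligned} \tag{A.26} \]

\[ \begin{aligned}
H \equiv g_{\mu\nu}H^{\mu\nu} &= \frac{e^2Q_f^2}{s}\, \mathrm{Tr}\left[ \not{k}_1 \gamma^\mu \not{k}_2 \gamma_\mu \right] \\
&= -4e^2Q_f^2.
\end{aligned} \tag{A.27} \]

Plugging into Eq. A.11, and using Eq. A.13, we can find the cross section:

\[ \sigma = \frac{1}{2s} \cdot \frac{1}{3} \cdot (-e^2)(-4e^2Q_f^2) \cdot \frac{1}{8\pi} = \frac{4\pi\alpha^2}{3s}\, Q_f^2. \tag{A.28} \]

Summing over quark charges and colors, we get our final expression:

\[ \sigma_{\text{tree}} = \frac{4\pi\alpha^2}{3s}\, N_c \sum_f Q_f^2 \equiv \sigma_0\, N_c \sum_f Q_f^2. \tag{A.29} \]

The astute reader will note that $\sigma_0$ is the total (tree-level, massless) cross section for
$e^+e^- \to \mu^+\mu^-$. In the massless limit, the only difference for quarks is the color and
charge factors.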
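To put numbers on the tree-level result, the sketch below evaluates $\sigma_0 = 4\pi\alpha^2/(3s)$ in nanobarns and the color and charge factor $N_c \sum_f Q_f^2$. It is only an illustration: the function names are invented for this example, and the numerical inputs (the low-energy value $\alpha \approx 1/137.036$ and the conversion $(\hbar c)^2 \approx 0.3893794\ \text{GeV}^2\,\text{mb}$) are standard constants supplied here rather than taken from the text.

```cpp
#include <cmath>
#include <vector>

// sigma_0 = 4 pi alpha^2 / (3 s), returned in nanobarns; s is given in GeV^2.
double sigma0_nb(double s_gev2) {
    const double pi = std::acos(-1.0);
    const double alpha = 1.0 / 137.036;      // fine-structure constant (low-energy value)
    const double gev2_to_nb = 0.3893794e6;   // (hbar c)^2 in nb GeV^2
    return 4.0 * pi * alpha * alpha / (3.0 * s_gev2) * gev2_to_nb;
}

// N_c times the sum of squared quark charges (charges in units of e).
double color_charge_factor(const std::vector<double>& charges, int n_colors = 3) {
    double sum = 0.0;
    for (double q : charges) sum += q * q;
    return n_colors * sum;
}
```

For the five quark flavors u, d, s, c, b the factor is $3 \times 11/9 = 11/3$, so well above the b threshold (and well below the Z) the tree-level hadronic cross section is about 3.7 times the muon-pair one.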

A.4 Virtual correction

We'll start with the virtual corrections. The two leg corrections involve scaleless
integrals (there is no dimensionful quantity that the integral over the loop momentum
could depend on). In dimensional regularization, the integrals have non-zero
dimension and therefore must be equal to zero. The relevant diagram is the tree-level
diagram with a gluon connecting the quark lines. The $O(\alpha_s)$ contribution to the total
cross section comes from the interference term $2\,\mathrm{Re}(\mathcal{M}^*_{\text{tree}} \mathcal{M}_{\text{virtual}})$. The matrix elements
share the same structure on the leptonic side, so $L^{\mu\nu}$ (Eq. A.24) is unchanged.
Meanwhile, $H^{\mu\nu}$ shares one factor with the tree-level calculation (Eq. A.25). We only
need to calculate the other half. As in Eq. A.3:

— ^-"tree virt' 

HZe = v{h){-teQnnu{hy, 



= g eQfH 



p2 



{27ry (A;i+p)2(p-A;2)V 

(A.30) 



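We work in $d$ dimensions from the start (hence the $\mu$ factor). Summing over final-state
spins, but not color (we leave an implicit $\delta$ function in color space, as in the
tree-level calculation), and using $t^a t^a = C_F \mathbf{1}$:

Performing the trace requires $\gamma$ matrix contractions in $d$ dimensions:

\[ \gamma^\alpha \gamma_\alpha = d \tag{A.32} \]

\[ \gamma^\alpha \gamma^\mu \gamma_\alpha = -(d-2)\gamma^\mu \tag{A.33} \]

\[ \gamma^\alpha \gamma^\mu \gamma^\nu \gamma_\alpha = 4g^{\mu\nu} - (4-d)\gamma^\mu\gamma^\nu \tag{A.34} \]

\[ \gamma^\alpha \gamma^\mu \gamma^\nu \gamma^\rho \gamma_\alpha = -2\gamma^\rho\gamma^\nu\gamma^\mu + (4-d)\gamma^\mu\gamma^\nu\gamma^\rho. \tag{A.35} \]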
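These contraction identities follow from the anticommutator $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$ together with $\gamma^\alpha\gamma_\alpha = d$. As a short worked check (not part of the original derivation), here are the steps for Eqs. A.33 and A.34; the longer identity A.35 follows by repeating the same move once more:

```latex
\begin{align*}
\gamma^\alpha \gamma^\mu \gamma_\alpha
  &= \left( 2g^{\alpha\mu} - \gamma^\mu \gamma^\alpha \right) \gamma_\alpha
   = 2\gamma^\mu - d\,\gamma^\mu
   = -(d-2)\,\gamma^\mu , \\[4pt]
\gamma^\alpha \gamma^\mu \gamma^\nu \gamma_\alpha
  &= \left( 2g^{\alpha\mu} - \gamma^\mu \gamma^\alpha \right) \gamma^\nu \gamma_\alpha
   = 2\gamma^\nu \gamma^\mu + (d-2)\,\gamma^\mu \gamma^\nu \\
  &= 2\left( 2g^{\mu\nu} - \gamma^\mu \gamma^\nu \right) + (d-2)\,\gamma^\mu \gamma^\nu
   = 4g^{\mu\nu} - (4-d)\,\gamma^\mu \gamma^\nu .
\end{align*}
```

Setting $d = 4$ recovers the familiar four-dimensional results $-2\gamma^\mu$ and $4g^{\mu\nu}$; the $(4-d) = 2\epsilon$ pieces are what generate the explicit $\epsilon$ terms in the trace.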
Now, we can do the trace, as usual setting $d = 4 - 2\epsilon$ for simplicity:

Tr[. . .] = -2Tr ' hhah] + 26 Tr [^rih + f){f - hhah] 

= -2 {l6A;i • {p - k2)k2 ■ {ki +p)- 2eTr [^^^^2] } 

+ 2e {8s{ki +p)-ip- k2) - 4esp^} 
= -2 |l6A;i • (p - k2)k2 ■ {ki + p) - 8e 2p ■ kip ■ k2 - ^p^] } 

+ 2e {8s{ki +p)-{p- k2) - 4esp^] 
= -32fci • (p - k2)k2 ■ (ki+p) - 8e^sp^ 
+ 16e 

!_ - - - ^- 
= 32h ■ {k2 - P)k2 ■ {ki + p) - Se^sp^ 



2p • kip ■ k2 - -p^ + s{ki + p) • (p - k2] 



+ 32e 



p ■ kip ■ k2 + ^{ki - k2) ■ p - ^ 



+ 8esp^ 



= 32fci ■ {k2 - p)k2 ■ {ki + p) + 8(1 - e)esp2 
-32e [(|-p./,i)(|+p./,2) 

= 32A;i ■ {k2 - p)k2 ■ (ki + p) + 8(1 - e)€sp^ 

-32e [ki ■ {k2 - p)k2 ■ {ki + p)] 
= 8(1 - e) [4h ■ {k2 - p)k2 ■ {ki +p) + esp^] 



(A.36) 



In the second line, we have used $\not{a}\not{a} = a^2$ and $k_1^2 = k_2^2 = 0$. We have also used
$k_1 \cdot k_2 = \frac{1}{2}(k_1 + k_2)^2 = \frac{s}{2}$. Plugging into Eq. A.31:



-8i 



2g f d'^p Aki ■ {k2 -p)k2- {ki + p) + esp^ 



(27r)< 



{ki +p)2(p - k2Yp^ 



. (A.37) 



We now introduce Feynman parameters in the standard way:

\[ \frac{1}{(k_1+p)^2 (p-k_2)^2 p^2} = 2 \int_0^1 dx \int_0^{1-x} dy\, \frac{1}{\left[ x(k_1+p)^2 + y(p-k_2)^2 + (1-x-y)p^2 \right]^3}. \tag{A.38} \]

We can rewrite the denominator and shift variables to write it in the form $[l^2 - \Delta]^3$:

D = + 2xp ■ ki — 2yp ■ k2 

= + 2p • {xki - yk2) 
l = p+{xki-yk2) (A.39) 
D = f- {xki - yk^f 

— xys 
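The completed square can be checked directly. The sketch below is illustrative and assumes the same back-to-back massless $k_1$, $k_2$ as in the tree-level kinematics; it evaluates the Feynman-parametrized denominator and the shifted form $l^2 + xys$ with $l = p + xk_1 - yk_2$ for an arbitrary $p$. The names are mine.

```python
def mdot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def denominator_forms(p, x, y, energy=1.0):
    """Return (original, shifted) forms of the combined denominator D."""
    k1 = [energy, 0.0, 0.0, energy]
    k2 = [energy, 0.0, 0.0, -energy]
    s = 4.0 * energy ** 2                       # (k1 + k2)^2 for this frame
    k1_plus_p = [a + b for a, b in zip(k1, p)]
    p_minus_k2 = [a - b for a, b in zip(p, k2)]
    original = (x * mdot(k1_plus_p, k1_plus_p)
                + y * mdot(p_minus_k2, p_minus_k2)
                + (1.0 - x - y) * mdot(p, p))
    l = [pi + x * a - y * b for pi, a, b in zip(p, k1, k2)]
    shifted = mdot(l, l) + x * y * s
    return original, shifted
```

The two returned values agree for any $p$, $x$, $y$, which confirms $\Delta = -xys$ in $[l^2 - \Delta]^3$.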

We can change the integration variable to $l$ because we're integrating over all $p$ and $l$ 
is just a constant additive shift. Now we need to rewrite the numerator in terms of 
$l$ instead of $p$: 

\[
\begin{aligned}
N &= 4\,k_1\cdot(k_2 - p)\,k_2\cdot(k_1 + p) + \epsilon s p^2 \\
  &= 4\,k_1\cdot\big((1 - y)k_2 - l + xk_1\big)\,k_2\cdot\big((1 - x)k_1 + l + yk_2\big) + \epsilon s\left(l^2 - 2l\cdot(xk_1 - yk_2) - xys\right)
\end{aligned}
\tag{A.40}
\]

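The integral over an odd number of factors of $l$ will vanish by parity, so we can drop 
terms linear in $l$ (the $xk_1$ and $yk_2$ terms also drop out because $k_1^2 = k_2^2 = 0$): 

\[
\begin{aligned}
N &= 4\,k_1\cdot\big((1 - y)k_2 - l\big)\,k_2\cdot\big((1 - x)k_1 + l\big) + \epsilon s\left(l^2 - xys\right) \\
  &= 4\left[(1 - x)(1 - y)\frac{s^2}{4} - (k_1\cdot l)(k_2\cdot l)\right] + \epsilon s\left(l^2 - xys\right)
\end{aligned}
\tag{A.41}
\]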



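Again using symmetry, we can replace $l^\mu l^\nu \to \frac{1}{d} l^2 g^{\mu\nu}$ inside the integral (after inte- 
grating, the tensor structure of the integral can only come from $g^{\mu\nu}$; contracting with 
$g_{\mu\nu}$ fixes the coefficient): 

\[
\begin{aligned}
N &= 4\left[(1 - x)(1 - y)\frac{s^2}{4} - k_1^\mu k_2^\nu\, l_\mu l_\nu\right] + \epsilon s\left(l^2 - xys\right) \\
  &= 4\left[(1 - x)(1 - y)\frac{s^2}{4} - \frac{l^2}{2(2 - \epsilon)}\,\frac{s}{2}\right] + \epsilon s\left(l^2 - xys\right) \\
  &= \left[(1 - x)(1 - y) - \epsilon xy\right]s^2 + \left[-\frac{1}{2 - \epsilon} + \epsilon\right] s\, l^2.
\end{aligned}
\tag{A.42}
\]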
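Plugging everything into Eq. A.37: 

\[
\begin{aligned}
g_{\mu\nu}H^{\mu\nu} &= -16i\, g^2 e^2 Q_f^2 C_F (1 - \epsilon)\,\mu^{2\epsilon} \int_0^1 dx \int_0^{1-x} dy \int \frac{d^dl}{(2\pi)^d}\, \frac{\left[(1 - x)(1 - y) - \epsilon xy\right]s^2 + \left[-\frac{1}{2 - \epsilon} + \epsilon\right] s\, l^2}{(l^2 + xys)^3} \\
&= -16i\, g^2 e^2 Q_f^2 C_F (1 - \epsilon)\,\mu^{2\epsilon} \int dx\, dy \left[C_0 I_0(-xys) + C_2 I_2(-xys)\right]
\end{aligned}
\tag{A.43}
\]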

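In the second line we've separated the simple integrals from their coefficients. I'll just 
pull the standard forms out of Peskin and Schroeder: 

\[
\begin{aligned}
I_0(\Delta) &\equiv \int \frac{d^dl}{(2\pi)^d}\, \frac{1}{(l^2 - \Delta)^3} = \frac{-i}{(4\pi)^{2-\epsilon}}\, \frac{\Gamma(1 + \epsilon)}{2} \left(\frac{1}{\Delta}\right)^{1+\epsilon} \\
I_2(\Delta) &\equiv \int \frac{d^dl}{(2\pi)^d}\, \frac{l^2}{(l^2 - \Delta)^3} = \frac{i}{(4\pi)^{2-\epsilon}}\, \frac{2 - \epsilon}{2}\, \Gamma(\epsilon) \left(\frac{1}{\Delta}\right)^{\epsilon} \\
&= \frac{i}{(4\pi)^{2-\epsilon}}\, \frac{(2 - \epsilon)}{2}\, \frac{\Gamma(1 + \epsilon)}{\epsilon} \left(\frac{1}{\Delta}\right)^{\epsilon} \\
&= -\frac{2 - \epsilon}{\epsilon}\, \Delta\, I_0(\Delta).
\end{aligned}
\tag{A.44}
\]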



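Combining the two terms: 

\[
\begin{aligned}
C_0 I_0(-xys) + C_2 I_2(-xys)
&= \left[(1 - x)(1 - y) - \epsilon xy\right] s^2 I_0(-xys) + \left[-\frac{1}{2 - \epsilon} + \epsilon\right] s\, I_2(-xys) \\
&= \left\{\left[(1 - x)(1 - y) - \epsilon xy\right] s^2 + \left[-\frac{1}{2 - \epsilon} + \epsilon\right]\left(-\frac{2 - \epsilon}{\epsilon}\right)(-xys)\, s\right\} I_0(-xys) \\
&= \left\{\left[(1 - x - y + xy) - \epsilon xy\right] - xy\left[\frac{1}{\epsilon} - (2 - \epsilon)\right]\right\} s^2 I_0(-xys) \\
&= \left\{(1 - x - y) + xy\left[3 - 2\epsilon - \frac{1}{\epsilon}\right]\right\} s^2 I_0(-xys) \\
&= \left\{(1 - x - y) - \frac{xy}{\epsilon}(1 - \epsilon)(1 - 2\epsilon)\right\} s^2 I_0(-xys).
\end{aligned}
\tag{A.45}
\]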
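The bookkeeping in this combination is easy to get wrong, so here is a small exact check with rational arithmetic. It is only a sketch: it assumes the coefficients $C_0 = [(1-x)(1-y) - \epsilon xy]s^2$ and $C_2 = [\epsilon - 1/(2-\epsilon)]s$ and the relation $I_2(\Delta) = -\frac{2-\epsilon}{\epsilon}\Delta I_0(\Delta)$ with $\Delta = -xys$, and compares the coefficient of $s^2 I_0$ before and after simplification. The function names are mine.

```python
from fractions import Fraction

def combined_coefficient(x, y, eps):
    """Coefficient of s^2 I_0(-xys) in C_0 I_0 + C_2 I_2, before simplification."""
    c0 = (1 - x) * (1 - y) - eps * x * y           # C_0 / s^2
    c2 = eps - 1 / (2 - eps)                        # C_2 / s
    # I_2 / (s I_0) = -((2 - eps)/eps) * Delta / s, with Delta = -x*y*s
    i2_over_s_i0 = -((2 - eps) / eps) * (-x * y)
    return c0 + c2 * i2_over_s_i0

def compact_coefficient(x, y, eps):
    """Simplified coefficient: (1 - x - y) - (xy/eps)(1 - eps)(1 - 2 eps)."""
    return (1 - x - y) - (x * y / eps) * (1 - eps) * (1 - 2 * eps)
```

Passing `Fraction` arguments, e.g. `combined_coefficient(Fraction(1, 3), Fraction(1, 5), Fraction(1, 7))`, keeps everything exact, so the two functions can be compared with `==`.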
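We can write out $I_0(-xys)$: 

\[
I_0(-xys) = \frac{-i}{(4\pi)^{2-\epsilon}}\, \frac{\Gamma(1 + \epsilon)}{2} \left(\frac{-1}{xys}\right)^{1+\epsilon}
= \frac{i}{32\pi^2 s} \left(\frac{4\pi}{s}\right)^{\epsilon} (-1)^{\epsilon}\, \frac{\Gamma(1 + \epsilon)}{(xy)^{1+\epsilon}}.
\]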



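Plugging Eq. A.45 back into Eq. A.43: 

\[
\begin{aligned}
g_{\mu\nu}H^{\mu\nu} &= -16i\, g^2 e^2 Q_f^2 C_F (1 - \epsilon)\,\mu^{2\epsilon} \int dx\, dy \left[C_0 I_0(-xys) + C_2 I_2(-xys)\right] \\
&= -16i\, g^2 e^2 Q_f^2 C_F (1 - \epsilon)\,\mu^{2\epsilon} \int dx\, dy \left\{(1 - x - y) - \frac{xy}{\epsilon}(1 - \epsilon)(1 - 2\epsilon)\right\} s^2 I_0(-xys) \\
&= \frac{g^2 e^2 Q_f^2 C_F}{2\pi^2}\, s\, (1 - \epsilon) \left(\frac{4\pi\mu^2}{s}\right)^{\epsilon} (-1)^{\epsilon}\, \Gamma(1 + \epsilon)\, I_{\rm virt}(\epsilon).
\end{aligned}
\]

The first integral has a pole; the second is finite but multiplies $1/\epsilon$ so we must 
keep the integral to $O(\epsilon)$. The integrals can be performed using Beta functions: 

\[
I_{\rm virt}(\epsilon) \equiv \int_0^1 dx \int_0^{1-x} dy \left[\frac{1 - x - y}{(xy)^{1+\epsilon}} - \frac{(1 - \epsilon)(1 - 2\epsilon)}{\epsilon\,(xy)^{\epsilon}}\right]
= \frac{1}{\epsilon^2} + \frac{3}{2\epsilon} + 4 - \frac{\pi^2}{6} + O(\epsilon).
\tag{A.48}
\]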
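The expansion of the Feynman parameter integral can be cross-checked against a closed form. The sketch below rests on an assumption of mine: that the integral in question is $\int_0^1 dx\int_0^{1-x} dy\,[(1-x-y)(xy)^{-1-\epsilon} - (1-\epsilon)(1-2\epsilon)\,\epsilon^{-1}(xy)^{-\epsilon}]$, which evaluates via the generalized Beta (Dirichlet) formula $\int dx\,dy\, x^{a-1}y^{b-1}(1-x-y)^{c-1} = \Gamma(a)\Gamma(b)\Gamma(c)/\Gamma(a+b+c)$, continued analytically in $\epsilon$. The code compares that closed form to the truncated Laurent series $1/\epsilon^2 + 3/(2\epsilon) + 4 - \pi^2/6$.

```python
import math

def i_virt_closed(eps):
    """Closed form of the (assumed) virtual Feynman-parameter integral."""
    first = math.gamma(-eps) ** 2 / math.gamma(2 - 2 * eps)
    second = ((1 - eps) * (1 - 2 * eps) / eps
              * math.gamma(1 - eps) ** 2 / math.gamma(3 - 2 * eps))
    return first - second

def i_virt_series(eps):
    """Laurent expansion through O(eps^0)."""
    return 1 / eps ** 2 + 3 / (2 * eps) + 4 - math.pi ** 2 / 6
```

The difference `i_virt_closed(eps) - i_virt_series(eps)` should shrink linearly with `eps`, as expected for an $O(\epsilon)$ remainder.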



Before plugging in this form, let's collect all the factors in $\sigma_{\rm virt}$ using Eqs. A.11, A.13, 
and A.26. Note that while $L^{\mu\nu}$ has not changed, the contraction in Eq. A.26 has to be 
modified in $d$ dimensions, giving an extra factor of $(1 - \epsilon)$. 
1 1 



Cvirt 



2s d-1 
1 1 



L / dUoH 



2s 3 - 2e ^ ^ ^ ^ Stt 



X 2mc 



s / r(2-2e) 



r(l + e)7virt(e) 



-g^e^QjCp 3 r(l-6)r(l + 6) 
487r3s 3 - 2e r(2 - 2e) 



2\ 2e 



/virt(e)9^e[(-l)^ 



487r3^ 



-r(l - e)r(l + e)/virt(e)^e [(-1)1 i/(e) 



We have pulled out the strange-looking term 

3 1 



H{e) 



2 / 47r/x^ 



2\ 2e 



1 + 0{e) 



(A.49) 



(A.50) 



3-2er(2-2e) 

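because this factor will appear in $\sigma_{\rm real}$ too. Now we can expand the rest of $\sigma_{\rm virt}$ in $\epsilon$. 
The only tricky bit is: 

\[
\mathrm{Re}\left[(-1)^{\epsilon}\right] = \mathrm{Re}\left[e^{\pm i\pi\epsilon}\right] = \mathrm{Re}\left[1 \pm i\pi\epsilon - \frac{\pi^2\epsilon^2}{2} + \cdots\right] = 1 - \frac{\pi^2\epsilon^2}{2} + \cdots
\tag{A.51}
\]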



The ± comes from choosing which side of the branch cut to pick, and hence the sign 
of the $i\varepsilon$ in the propagators we ignored in Eq. A.31; in the end taking the real part 
lets us ignore this subtlety. With the expansion of the integral and the $\Gamma$ functions, 
we have (dropping terms $O(\epsilon)$): 



virt 



3s 



27r 



Hie 



e 

2 _ 3 

"l2 ~ 7 



(A.52) 



In the last line we have summed over flavors and colors; recall the implicit $\delta$ function 
in color space. 
A.5 Real emission 



Now that we have calculated the correction to $\sigma$ from a virtual gluon, we need to 
consider the emission of a gluon. There are two diagrams that contribute to the real 
correction: emission of a gluon from either of the quarks. Since the final state is 
distinct from the tree-level and virtual diagrams, there is no interference. So what we 
want to calculate is the sum of the two real diagrams. We follow the same procedure 
as above and break the calculation into leptonic and hadronic parts. $L$ will be the 
same. Adding the two diagrams, we find $H$: 



+ 



7' 



k2 ■ k3 



-7 



ki ■ ks 



ki ■ k3 



V{k2) 

v{k2); 



s 

^]g'e'Q}Tr{t^t^)el{ks)s^{ks)v{k2) 



7 —J J — 7 - T—j J — 7 

k2 • fts ki- ks 



X u{ki)u{ki) 



{$2 "I" fe) 
ko ■ k^ 



(^1 + ^3 

ki ■ k3 



v{k2). 



(A.53) 



Doing the spin and polarization sums (the last allows the replacement $\epsilon^*_\alpha(k_3)\epsilon_\beta(k_3) \to -g_{\alpha\beta}$): 



H = 



4s 



Tr^^ 



7 ^ -■, — 7 -7 ^ -■, — 7 

k2 ■ ks ki- ks 



X 



+ 



7q; 7q; 



(1^1 + 1^3) 



7m 



(A.54) 



A;2 • A;3 /ci • kz 

When the dust settles (no tricks here, just use the contraction formulae), we have: 

[l-e){xl + xl) + 2e{l-X3) 



H = — - — ^^(1 
s 



2e 



(A.55) 



(1-Xi)(l-X2) 

Recall the definitions $x_i = 2k_i \cdot q/s$, where $q$ is the photon momentum; $x_3$ is fixed by 
the other two. The only dependence on the final-state phase space is on $x_1$ and $x_2$. 
Let's collect the factors in $\sigma_{\rm real}$, using Eqs. A.11, A.22, and A.26: 

1 1 



'-'^real 



L / dUsH 



2sd-l' 
1 (-(l-e)e^) s {ATTi^ysf' 
2s 3 - 2e 1287r3 r(2 - 2e) 



/ 



dxidx2((l - xi)(l - X2)(l - xs)) 



2e 



g'e'QjCF3{l-ef{A7rf,ys) 
967r3s 3 - 2e r(2 - 2e) 

{l-e){xl + xl) + 2e{l-X3) 

{1-Xi){l-X2) 



X J dxidx2 
~3s 27r 



-2e 



P{xi,X2) 



H[e) I dxidx2 



(l-e)(xf + xi) + 2e(l-X3) 

(1-Xi)(l-X2) 



2e 



0-0- 



r 

27r 



-i/(e)/real(e). 



where $P(x_1, x_2) = [(1 - x_1)(1 - x_2)(1 - x_3)]^{\epsilon}$. The integral is 

\[ I_{\rm real}(\epsilon) = \frac{2}{\epsilon^2} + \frac{3}{\epsilon} + \frac{19}{2} - \pi^2 + O(\epsilon). \]

All together, this yields (to $O(1)$) 



Creal — ^0" 



2n 



H{e) 



4 + - + 19/2-7r^ 



Finally, adding sums over flavor and color, we get 



4 + - + 19/2-7r^ 



1 



P(X1,X2) 



(A.56) 



(A.57) 



(A.58) 



(A.59) 



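A.6 Final result 

We now have our final result. Combining Eqs. A.29, A.52, and A.59, we find: 

\[
\begin{aligned}
\sigma(e^+e^- \to \text{hadrons}) &= \sigma_0 \Big(\sum_f Q_f^2\Big) N_c \left[1 + \frac{\alpha_s}{\pi}\,\frac{3C_F}{4}\right] \\
&= \sigma_0 \Big(\sum_f Q_f^2\Big) N_c \left[1 + \frac{\alpha_s}{\pi}\right].
\end{aligned}
\tag{A.60}
\]

In the last line we have inserted the appropriate color factors for $SU(3)$. 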
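The pole cancellation behind this result can be made explicit with a few lines of exact arithmetic. This is only a sketch under stated assumptions: the real-emission bracket $2/\epsilon^2 + 3/\epsilon + 19/2 - \pi^2$ is taken from the real-emission integral above, while the virtual bracket $-2/\epsilon^2 - 3/\epsilon - 8 + \pi^2$ is the standard textbook value, which I assume here because it multiplies the same prefactor $\sigma_0\,(C_F\alpha_s/2\pi)\,H(\epsilon)$. The variable names are mine.

```python
from fractions import Fraction

# Coefficients of (1/eps^2, 1/eps, rational constant, pi^2) in each bracket.
REAL = (Fraction(2), Fraction(3), Fraction(19, 2), Fraction(-1))
VIRTUAL = (Fraction(-2), Fraction(-3), Fraction(-8), Fraction(1))   # assumed standard result

def nlo_coefficient(cf=Fraction(4, 3)):
    """Return the coefficient of alpha_s/pi after the poles cancel."""
    total = tuple(r + v for r, v in zip(REAL, VIRTUAL))
    # Double pole, single pole and pi^2 terms must cancel between real and virtual.
    assert total[0] == 0 and total[1] == 0 and total[3] == 0
    # sigma = sigma_0 [1 + (C_F alpha_s / 2 pi) * total[2]]
    return cf / 2 * total[2]
```

With $C_F = 4/3$ the function returns exactly 1, i.e. the familiar correction factor $1 + \alpha_s/\pi$; for general $C_F$ it returns $3C_F/4$.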
A.7 References 

The "pink book" [31] is a good reference for the general ideas here. For the calcu- 
lational details any field theory textbook should suffice; I've made extensive reference to 
Peskin and Schroeder [4]. The "Handbook of Perturbative QCD" [32] is also a useful 
reference. The CTEQ collaboration maintains a website with many useful and in- 
teresting QCD links [125]. Of particular note is a similar one-loop calculation of the 
Drell-Yan process by Bjorn Potter [126]. 
Appendix B 

THE QUARK JET FUNCTION IN SCET 

In this appendix I give an example calculation in soft-collinear effective theory: 
the quark jet function (Eq. (3.10a)) at next-to-leading order.^1 I repeat Eq. (3.10a) 
here for reference (changing notation slightly): 




'N{J{X„))-1 



X (0| XnA^) \^n) {Xn\ XnA^) 1°) S{tj - r„( J(X„))). (B.l) 

From here on I will drop the "n" subscript on the jet function; the collinear direction 
will always be $n$. 

The jet functions can be divided into two categories: those for measured jets, 
which are fixed to have a specific angularity $\tau_a$, and those for unmeasured jets, which 
are not. I will denote the quark jet function by $J^q_\omega$, where $\omega$ is the label momentum, 
and the jet function $J^q_\omega(\tau_a)$ with an argument of $\tau_a$ denotes a measured jet. I will 
calculate the jet function for the two classes of jet algorithms, $k_T$-type and cone-type 
algorithms. 

B.l Phase Space Cuts 

To calculate the jet functions for a particular algorithm, we must impose phase space 
restrictions in the matrix element. From the jet function definitions, these cuts take 
two forms. One kind, imposed by the operator $\delta_{N(\mathcal{J}(X_n)),1}$ in Eq. (B.1), is common to 
every jet function. It is the set of phase space restrictions related to the jet algorithm, 
and requires exactly one jet to arise from each collinear sector of SCET. The other, 
imposed by the operator $\delta(\tau_a - \hat\tau_a)$, is implemented only on measured jets and restricts 
the kinematics of the cut final states to produce a fixed value of the jet shape. In this 
section we describe these phase space cuts in detail. 

^1 This appendix is taken from Sections 5.1, 5.2, and A.1 of [25]. Additional steps and explana- 
tions have been added. 



Figure B.1: A representative diagram for the NLO quark and gluon jet functions. 
The incoming momentum is $l = \frac{n}{2}\omega + \frac{\bar n}{2} l^+$ and particles in the loop carry momentum 
$q$ ("particle 1") and $l - q$ ("particle 2"). 

The typical form of the NLO diagrams in the jet functions is shown in Fig. B.1. As 
shown in the figure, the momentum flowing through the graph has label momentum 
$l^- = \bar n \cdot l = \omega$ and residual momentum $l^+ = n \cdot l$, and the loop momentum is $q$. We 
will label "particle 1" as the particle in the loop with momentum $q$ and "particle 2" 
as the particle in the loop with momentum $l - q$. For the quark jet, we take particle 
1 as the emitted gluon and particle 2 as the quark. 

As usual, the total forward scattering matrix element can be written as a sum 
over all cuts. Cutting through the loops corresponds to the interference of two real 
emission diagrams, each with two final state particles, whereas cutting through a lone 
propagator that is connected to a current corresponds to the interference between a 
tree-level diagram and a virtual diagram, each with a single final state particle. Thus, 
the phase space restrictions and measurements we impose act differently depending 
on where the diagrams are cut. In addition, since we will be working in dimensional 
regularization (with $d = 4 - 2\epsilon$), which sets scaleless integrals to zero, the only 
diagrams that contribute are the cuts through the loops. This means that we only 
need to focus on the form of phase-space restrictions and angularities in the case of 
final states with two particles. 

The regions of phase space for two particles created by cutting through a loop in 
the jet function diagrams can be divided into three contributions: 

1. Both particles are inside the jet. 

2. Particle 1 exits the jet with energy $E_1 < \Lambda$. 

3. Particle 2 exits the jet with energy $E_2 < \Lambda$. 

In contributions (2) and (3), the jet has only one particle, which is the remaining 
particle with $E > \Lambda$. In principle, an exiting particle could have $E_i > \Lambda$ if it entered 
another jet. As long as the jets are all well separated, this contribution is power 
suppressed, since it requires a collinear particle to be at large angle to the collinear 
direction. If it were not power suppressed it would break factorization, since a given 
jet function does not know about the directions of other jets. This is one reason we 
require $t_{ij} \gg 1$ (Eq. (3.1)). 

It is well known^2 that collinear integrations of jet functions can be allowed to 
extend over all values of loop momenta so long as a "zero-bin subtraction" is taken 
from the result to avoid double counting the soft region already accounted for in 
the soft function. We will demonstrate that contributions (2) and (3) are power 
suppressed by $O(\Lambda/\omega)$, which scales as $\lambda^2$, after the zero-bin subtraction. 

^2 To those who know it well, of course; e.g., [36]. 

The phase space cuts that enforce both particles to be in the jet depend on the 
jet algorithm. There are two classes of jet algorithm that we consider, cone-type 
algorithms and (inclusive) $k_T$-type algorithms, and all the algorithms in each class 
yield the same phase space cuts. We label the phase space restrictions as $\Theta_{\rm cone}$ and 
$\Theta_{k_T}$, generically $\Theta_{\rm alg}$. For the cone-type algorithms, 




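\[
\Theta_{\rm cone} \equiv \Theta_{\rm cone}(q, l^+) = \Theta\left(\tan^2\frac{R}{2} > \frac{q^+}{q^-}\right) \Theta\left(\tan^2\frac{R}{2} > \frac{l^+ - q^+}{\omega - q^-}\right).
\]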



These $\Theta$ functions demand that both particles are within $R$ of the label direction. For 
the $k_T$-type algorithms, the only restriction is that the relative angle of the particles 
be less than $R$: 



In the second line we took the collinear scaling of $q$ ($q^+ \ll q^-$). While this is not 
strictly needed, it makes the calculations significantly simpler. 

For the phase space restrictions of zero-bin subtractions, we take the soft limit of 
the above restrictions (all components of $q$ scale like $\lambda^2$). The zero-bin subtractions 
are the same for all the algorithms we consider. For the case of particle 1, which has 
momentum $q$, the zero-bin phase space cuts are given by 



For the quark jet function, we don't need a zero bin for particle 2, since the quark is 
never soft. 

For all the jet algorithms we consider, the zero-bin subtractions of the unmeasured 
jet functions are scaleless integrals.^3 However, for the measured jet functions, the zero- 
bin subtractions give nonzero contributions that are needed for the consistency of the 
effective theory. 

In the case of a measured jet, in addition to the phase space restrictions we also 
demand that the jet contributes to the angularity by an amount $\tau_a$ with the use of 



(B.2) 




(B.3) 



^3 Note that algorithms do exist that give nonzero zero-bin contributions to unmeasured jet func- 
tions [61]. 
the delta function $\delta_R = \delta(\tau_a - \hat\tau_a)$, which is given in terms of $q$ and $l^+$ by 

\[
\delta_R \equiv \delta_R(q, l^+) = \delta\left(\tau_a - \frac{1}{\omega}(\omega - q^-)^{a/2}(l^+ - q^+)^{1-a/2} - \frac{1}{\omega}(q^-)^{a/2}(q^+)^{1-a/2}\right).
\tag{B.4}
\]

In the zero-bin subtraction of particle 1, the on-shell conditions can be used to write 
the corresponding zero-bin $\delta$-function as 

\[
\delta_R^{(0)} = \delta\left(\tau_a - \frac{1}{\omega}(q^-)^{a/2}(q^+)^{1-a/2}\right).
\tag{B.5}
\]
B.2 Quark Jet Function 




(A) (B) (C) (D) 

Figure B.2: Diagrams contributing to the quark jet function. (A) and (B) Wilson 
line emission diagrams; (C) and (D) QCD-like diagrams. 

The diagrams corresponding to the quark jet function are shown in Fig. B.2. The 
fully inclusive quark jet function is defined as 

I dSe^'- (0| xTMxtM |0) = J'il^)^ 
and has been computed to NLO (see, e.g., [127, 128]) and to NNLO [129]. Below we 
compute the quark jet function at NLO with phase space cuts for the jet algorithm 
for both the measured jet, $J^q_\omega(\tau_a)$, and the unmeasured jet, $J^q_\omega$. As discussed above, 
the only nonzero contributions come from cuts through the loop when both particles 
are inside the jet. 

B.2.1 Measured Quark Jet 

The measured quark jet function includes contributions from naive Wilson line graphs 
(A) and (B) and QCD-like graphs (C) and (D) in Fig. B.2. Using the SCET Feynman 
rules [12], the matrix element for graph (A), cut through the loop, is: 

X {-2niq'6{q')e{q')) {-2ni{l - q)H {{I - qfOif - q'))) e,,,Sn 
X S{q')Q{q'^)S {{1 - qf) Q{f - q')e,i,SR 
6{q')Q{q')6 {{I - qf) e(/° - q')e.,,SR. (B.6) 



X 



In the third line we have used the $SU(N)$ identity $T^A T^A = C_F \mathbb{1}$, where $\mathbb{1}$ is the 
identity matrix in color space, $\mathrm{Tr}(\mathbb{1}) = N_C$. The last two parentheticals in the 
(continued) first line represent the cut across the two propagators in the loop. The 
factor of $\mu^{2\epsilon}$ is there to ensure the whole expression has the correct dimension. 

Graph (B) is just the reflection of (A) and therefore has the same value. As 
noted in [57], the sum of graphs (C) and (D) is equivalent to the plain QCD diagram, 
bracketed by projections onto the coUinear propagator: Aic + A^d = Pn-MqcDPn- 
This is because we can freely boost to a frame where the momenta in the QCD 
diagram have collinear scaling. The projected and cut matrix element is thus: 

Disc[A^c.o] |^p4 (^^^'^^') (fef) ^'''^^^^ 0^) (-^^) 

X {-2mq'S{q')e{q')) {-2m{l - qf5 {{I - qf)) 

X S{q')Q{q')S {{I - q)') e(/° - q'^)Q.,M (B.7) 
In the second line we have used several identities involving the collinear projection 



operators: P„/ = P„ ^cj^ + Z"*"!^ = likewise j(Pn = uj^Pn] and P„ 



1^ Tjl 

2 ~ 21 
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IPfi — ^. The Dirac structure can be simplified as follows: 
l.(/-^)4 = -(<i-2)^(/--'^ 



= -(d-2)f(i--,-)f^ 

= -2(1 -9+)^. 

In the first line we have used a 7-matrix contraction in d dimensions. In the second 
we have used the facts that {if)"^ — and that ^ anticommutes with Putting this 
back into Eq. (B.7) we have: 



Disc[A1c+D] = H'-J (^(8^V)C^1(|^(1 - e){l^ - 

6iq')Qiq')6 ((/ - qf) e(/° - q')e,y,6n. (B.8) 



X 

The total cut matrix element is: 

3d 



X 5(g^)e(?°)5 {{I - qf) Q{f - q')e.i,5R. (B.9) 
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We can now plug this into Eq. (3.10a) to find the full naive quark jet function: 

X {0\Xn,M\Xn) {Xn\ Xn,ui {0)\0) 6{Tj-Ta{J{X^))) 
_ 1 fdl+i 



2Nc 



Tr/|^fDisc.„,.,[M] 



2 2 

X d{q')e{q')6 ((/ - qf) 0(/° - q')Q.i,6n 



27r J {27rY \q- ' ' uj - q- 

X 2nS{q+q- - ql)e{q+)e{q-)2nS f Z+ - q+ ^ 

V u — q 

X - q-)Q{l+ - q+)Q,igSR. (B.IO) 

The trace in the second line is over Dirac and color indices. The contribution pro- 
portional to $1 - \epsilon$ comes from the QCD-like graphs (C) and (D) in Fig. B.2. Only 
the Wilson line graphs have a nonzero zero-bin limit, which comes from taking the 
scaling limit $q \sim \lambda^2$ of the naive contribution: 



(B.ll) 



X 



2n5{l^-q^)Q{l^-q^)Q^^/^\ 



All jet algorithms that we use yield the same zero-bin contribution, since the phase 
space cuts are the same. 

To evaluate these integrals, we can start with the trivial $l^+$ integral over the $\delta$ 
function (note that the factor $\delta(q^+q^- - q_\perp^2)$ enforces $q_\perp^2 = q^+q^-$): 



dl+ 1 f d-^q fAl+ J^-q 



J'M =9'CfI^'' K-77T^o 77^ — + 2(1 - 6) 



J {27rY 



+ 



277 {l+y J (27r)<^ \q~ ' ' u - q- 

X 27r5(g+g- - gl)e(g+)e(g-)27r<5 (t - g+ - ^^"j 6(0; - q-)Q{l+ - q^)Qai^6R 



(27r)'^ \ / \?~(<^~?~) (f^ ~ ?~)^ 

X 27r5(?+g- - ?l)e(g+)e(?-)e(a; - g-)eaig5^. 



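It is easiest to split the $q$ integral into light-cone components: 

\[
\begin{aligned}
d^dq &= \frac{1}{2}\, dq^+ dq^-\, d^{d-2}q_\perp \\
&= \frac{1}{2}\, dq^+ dq^-\, \Omega_{d-2}\, q_\perp^{d-3}\, dq_\perp \\
&= \frac{1}{2}\, dq^+ dq^-\, \Omega_{2-2\epsilon}\, q_\perp^{1-2\epsilon}\, dq_\perp \\
&= \frac{1}{2}\, dq^+ dq^-\, \frac{\pi^{1-\epsilon}}{\Gamma(1 - \epsilon)}\, \frac{dq_\perp^2}{(q_\perp^2)^{\epsilon}}.
\end{aligned}
\]

In the second line we have integrated out the $d - 3$ angles of the $q_\perp$ subspace, which 
do not appear in the integrand. In the last line we have used $\Omega_n = 2\pi^{n/2}/\Gamma(n/2)$. Returning to the full integral: 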

2e 1 /" dq^dq^ fuj — q~\^f 2uq~^ ^ ^ q~q'^ 



r(l - e) (27r)3-2^ y {q+q-y \ uq+ J \q-{u-q-) ' ' {u - q-f 
X e(g+)e(g-)e(a;-g-)eaig(5« 

" \-7f) W^) l^L ^ (2(1 - + (1 - ) 



2n \ J r(l - e) Jo x^+' Jo y^+^ 

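In the last line we have introduced scaled variables $x = q^-/\omega$ and $y = q^+/\omega$. To go 
further we must plug in explicit forms for $\Theta_{\rm alg}$ and $\delta_R$. Note that all we have needed 
to know so far is that they are both independent of the direction of $q_\perp$. For now we 
will consider the case of a cone-type algorithm: 

\[
\begin{aligned}
\Theta_{\rm cone} &= \Theta\left(\tan^2\frac{R}{2} > \frac{q^+}{q^-}\right) \Theta\left(\tan^2\frac{R}{2} > \frac{l^+ - q^+}{\omega - q^-}\right) \\
&= \Theta\left(\tan^2\frac{R}{2} > \frac{q^+}{q^-}\right) \Theta\left(\tan^2\frac{R}{2} > \frac{q^+q^-}{(\omega - q^-)^2}\right) \\
&= \Theta\left(\tan^2\frac{R}{2} > \frac{y}{x}\right) \Theta\left(\tan^2\frac{R}{2} > \frac{xy}{(1 - x)^2}\right).
\end{aligned}
\]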
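Meanwhile, the $\tau_a$-enforcing $\delta$ function is: 

\[
\begin{aligned}
\delta_R &= \delta\left(\tau_a - (1 - x)^{a-1}(xy)^{1-a/2} - x^{a/2}y^{1-a/2}\right) \\
&= \delta\left(\tau_a - (xy)^{1-a/2}\left((1 - x)^{a-1} + x^{a-1}\right)\right).
\end{aligned}
\]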
Putting this all together, we have: 

j'^M =^ f ^^y t^.r^. (2(1 + (1 - ^)^') 



27r V r(l - e) Jo Jo 1/^+' 

x©(tan'^>^^©rtan^?> 



xJ \ 2 (l-x)2 

x5(r„-(xy)i-«/2((i_:,)"-i_(,^)-i)) 



27r V w 



X © (r > ^^^) 5 (r„ - {xyy-'^/' ((1 - x)""^ - (x)""^)) , 
using the abbreviation $r = \tan^2(R/2)$. Doing the $y$ integral over the $\delta$ function: 

^ /" f r^) (^°-' + (1 - ^r^)^' 



27r \ uj'^ J r(l — e) Jo Vl~'^/2/ r^-''/^ 
X (2'-^ + (1 - e)x) © > ^) , 



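where $f_{\rm cone}(x)$ is defined as 

\[
f_{\rm cone}(x) \equiv
\begin{cases}
x^{2-a}\left(x^{a-1} + (1 - x)^{a-1}\right) & x < 1/2 \\
(1 - x)^{2-a}\left(x^{a-1} + (1 - x)^{a-1}\right) & x > 1/2.
\end{cases}
\]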
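To see that this single function really encodes both cone constraints, one can compare the two descriptions on random points. The sketch below is an illustration under my own assumptions: $a < 2$, $r = \tan^2(R/2)$, the cone restriction written as $r > y/x$ and $r > xy/(1-x)^2$ in the scaled variables, and $\tau_a(x, y) = (xy)^{1-a/2}\,[x^{a-1} + (1-x)^{a-1}]$. The function names are hypothetical.

```python
import random

def f_cone(x, a):
    """Piecewise function bounding tau_a / r^(1 - a/2) for the cone algorithm."""
    shape = x ** (a - 1) + (1 - x) ** (a - 1)
    edge = x if x < 0.5 else 1 - x            # min(x, 1 - x)
    return edge ** (2 - a) * shape

def inside_cone(x, y, r):
    """Both particles within the cone, in the scaled variables x and y."""
    return r > y / x and r > x * y / (1 - x) ** 2

def tau(x, y, a):
    """Angularity of the two-particle state after the l+ integration."""
    return (x * y) ** (1 - a / 2) * (x ** (a - 1) + (1 - x) ** (a - 1))

def constraints_agree(a, r, trials=2000, seed=1):
    """Check inside_cone <=> tau < r^(1 - a/2) f_cone on random points (a < 2)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(0.01, 0.99), rng.uniform(0.0, 1.0)
        if inside_cone(x, y, r) != (tau(x, y, a) < r ** (1 - a / 2) * f_cone(x, a)):
            return False
    return True
```

Note that `f_cone(0.5, a)` equals 1 for any `a`, so the largest angularity the cone allows is $r^{1-a/2} = \tan^{2-a}(R/2)$.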
The integration region is plotted in Fig. B.3. We can exploit the symmetry of the 
$\Theta$ function around $x = 1/2$ and rewrite the $x$ integral as being from 0 to 1/2: 

(^) ^ (^7.) f: ^ (1 - ^)-) 

To evaluate the remaining integral, we can analytically extract the coefficient of 
$\delta(\tau_a)$ by integrating over $\tau_a$ and using the fact that the remainder is a plus distribution. 
We define plus distributions as [58]: 

\[
[\Theta(x)g(x)]_+ \equiv \lim_{\epsilon \to 0} \frac{d}{dx}\left[\Theta(x - \epsilon)\,G(x)\right], \quad \text{with } G(x) \equiv \int_1^x dx'\, g(x'),
\tag{B.13}
\]

defined so as to satisfy the boundary condition $\int_0^1 dx\, [\Theta(x)g(x)]_+ = 0$. If we write 
A 



Ta 

agCp [ AniJi^\ 1 



+ 

,2~ 



27r \ u'^ ) r(l -e) Vl -«/2. 

- f l-x 



X / (x"-^ + (1 - xf-^) ^ ("2^ — - + 2-^ + (1 - e)) 
Jo \ X \ — X J 

The Ta integral is then simple: 

/ dTa —Q ifconeix) > . " ,^ = / dTa — 



where t^^(x) = "'^^ fconeix) ■ This leaves 
cksCf 47r//^^^ 1 1 



27r V r(l-e)e 

-1/2 



X 



/ + (1 - xy-^) ^ ("2^ — - + 2-^ + (1 - e)) 

Jo \ X 1 — X J 

X [r^-"/'x'-" (x"-^ + (1 - x)"-^)] ^ 

a^C^ fA'Kfi'^y 1 1 f^/^ dx f^l-x „ X 



fAnfi^Y 1 1 r'^ d^f 

27r V ^a;2 J r(l - e) e X V ^ "l-x 



+ 2— + (l-e) . 
In the $x$ integral, only the $2/x^{1+2\epsilon}$ term diverges as $\epsilon \to 0$, and this term can be 
easily integrated exactly. The rest of the terms can be expanded to $O(\epsilon)$ and then 
integrated. The result is 



27r V rcc;2 ) Y(\ - e) \ ' 2e ' 1 3 



1 3 7 vr^ 

+ ;^ + ;t - — + 31n2 



(B.14) 



We can find the rest of $J^q_{\rm cone}(\tau_a)$ by taking $\tau_a > 0$ in Eq. (B.12), which enforces a 
lower cutoff in the $x$ integral. This renders the whole integration finite and we can 
take $\epsilon \to 0$. This yields 



B- 



1 



Ta 
OLsCf 

27r ^r^a/2 

OCgCp ( 1 

27r 



1 - a/2 



1/2 



3^COne 



2 In 



X 1- X J Ta 



+ 



Xr. 



2 / T„ 



+ 



(B.15) 

where $f_{\rm cone}(x_{\rm cone}) = \tau_a / r^{1-a/2}$. The upper cutoff $\Theta(\tau_a^{\max} - \tau_a)$ appears because this 
equation has no solution for $\tau_a > \tau_a^{\max}$. 
All together the naive contribution is 
where for cone-type algorithms we have found 

The only difference between the jet algorithms that we consider resides in the 
finite distribution $J^q_{\rm alg}(\tau_a)$. We have calculated this piece explicitly for cone-type 
algorithms, and give the result for $k_T$-type algorithms below. Note that the divergent 
part of the naive contribution is proportional to $\delta(\tau_a)$. This is due to the fact that the 
jet algorithm regulates the distribution for $\tau_a > 0$. The divergent plus distributions 
come entirely from the zero-bin subtraction, to which we now turn. 

The zero-bin subtraction for the quark jet function is given by Eq. (B.11), which 
we can evaluate similarly to the naive result: 

This can be broken into $\delta(\tau_a)$ and plus distribution pieces using the relation 

valid for $\epsilon < 0$. 

Adding the leading-order contribution to all of the NLO graphs and expanding 
in powers of $\epsilon$, adopting the $\overline{\rm MS}$ scheme (i.e., taking $\mu^2 \to \mu^2 e^{\gamma_E}/4\pi$), the total 
quark jet function is 

This agrees with the standard jet function $J(k^2)$ given in [127, 128] by setting $a = 0$ 
and $k^2 = \omega^2\tau_a$. We have shown the divergent terms explicitly, and collect the finite 
pieces in $J^q_{\rm alg}(\tau_a)$, given below. Note that there is no jet algorithm dependence in the 
divergent parts of the jet function at this order in perturbation theory. 

Finite Parts of the Measured Quark Jet Function 

Having found $J^q_{\rm cone}(\tau_a)$ explicitly, we merely quote the result for $J^q_{k_T}(\tau_a)$, which can 
be found similarly: 
where $\mathcal{R}$ is the region in $x$ where the constraint 

\[
f_{k_T}(x) \equiv x^{2-a}(1 - x)^{2-a}\left(x^{a-1} + (1 - x)^{a-1}\right) > \frac{\tau_a}{\tan^{2-a}\frac{R}{2}}
\]

is satisfied. We plot this region in Fig. B.3B and C for the cases $a > -1$ and $a < -1$, 
respectively. The boundaries of this region are the points $x_{1,2}$ illustrated in the figure, 
and are given by the equation 

\[
f_{k_T}(x_{1,2}) = \frac{\tau_a}{\tan^{2-a}\frac{R}{2}}
\tag{B.21}
\]

where we take $x_2 > x_1$ if $x_2$ exists. The upper limit $\tau_a^{\max}$ is given by the maximum 
value over $x$ of the right-hand side of Eq. (B.2.1). In general, the constraint Eq. (B.2.1) 
is symmetric about $x = \frac{1}{2}$, and so the region $\mathcal{R}$ is symmetric about the same point. 

In general, if $a > -1$ or $\tau_a < 2^{a-2}\tan^{2-a}\frac{R}{2}$, then $\mathcal{R}$ is a single range in $x$. Otherwise, 
$\mathcal{R}$ is two disjoint ranges in $x$. Since $\tau_a > 2^{a-2}\tan^{2-a}\frac{R}{2}$ can only occur for $a < -1$, 
we can write $\mathcal{I}_{k_T}$ as 
(A) (B) (C) 

Figure B.3: Regions of integration for the (A) cone and $k_T$-type algorithms for (B) 
$a > -1$ and (C) $a < -1$. The allowed region of $x$ is where the (blue) functions 
$f_{{\rm cone},k_T}(x)$ lie above the (red) lines of constant $\tau_a/\tan^{2-a}(R/2)$. When $a < -1$ for 
the $k_T$ algorithm, there are two regions of integration when $\tau_a > 2^{a-2}\tan^{2-a}(R/2)$. 



Note that $\mathcal{I}_{k_T}$ involves the same integrand as in Eq. (B.15), but for $k_T$-type algo- 
rithms the integral is over a different range. In addition, both $x_{\rm cone}$ and $x_1$ approach 
the same limiting value for small $\tau_a$, 
Thus, we can extract the small $\tau_a$ behavior of both distributions by writing 
where $x = x_{\rm cone}$ or $x_1$ for the cone and $k_T$ algorithms, respectively. Defining 
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rg(x) = 3x + 2 m , 
using Eq. (B.2.1), and including the zero-bin subtraction in Eq. (B.18), we find that 
the finite distributions of the full measured quark jet functions are 
For $a = 0$, these expressions for the jet functions can be simplified further to give 
for the cone jet function, and 
for the $k_T$ jet function. In Eq. (B.24b), $x_1$ is given by its value for $a = 0$, 

(B.25) 




In Eq. (B.24), we have divided the cone and $k_T$ jet functions into the contribution 
$J^q_{\rm incl}(\tau_0)$ to the inclusive jet function [127, 128], given by 



To V 2 Uj'^T 



(B.26) 

and algorithm-dependent parts. The algorithm-dependent part of the $a = 0$ cone jet 
function Eq. (B.24a) agrees with [121]. Note that if one takes $R$ to be parametri- 
cally larger than $\tau_0$ (cf. Sec. 3.6 and Eq. (3.35)), the algorithm-dependent parts of 
Eq. (B.24) are power suppressed, and the cone and $k_T$ jet functions reduce to the 
inclusive jet function. 

B.2.2 Gluon Outside Measured Quark Jet 

In this section we calculate the contribution to the quark jet function from the region 
of phase space in which the gluon exits the jet carrying an energy $E_g < \Lambda$. This cut 
causes the contribution to be power suppressed by $\Lambda/\omega$, which scales as $\lambda^2$. However, 
we elect to evaluate this case explicitly as it provides a clear example of the zero-bin 
subtraction giving the proper scaling to the total contribution. We only evaluate 
this contribution for the cone algorithm; the details of the $k_T$ algorithm calculation 
are similar. Note that the contribution when the quark is out of the jet is power 
suppressed at the level of the Lagrangian given in Sec. 2.3.1, in which soft quarks do not 
couple to collinear partons at leading order in $\lambda$. 

For the cone algorithm, the gluon exits the jet when the angle between the jet axis, 
$n$, and the gluon is greater than $R$. When the gluon is not in the jet, the cone axis is 
the quark direction, and so it makes no contribution to the angularity. Therefore, this 
region of phase space contributes only to the $\delta(\tau_a)$ part of the angularity distribution. 
For the naive contributions, requiring the gluon to be outside the jet and have 
energy less than $\Lambda$, we have the integral 

X e{q-)e{q+)27Td (^1+ -q+ - e{uj - q-)e{l+ - q+) 

X e (^^ - tan^ e (2A - q-) 5(r„). (B.27) 

This is simply Eq. (B.10) with different phase space $\Theta$ functions and $\delta_R$ replaced by 
$\delta(\tau_a)$. Note that the theta function requiring $q^- < 2\Lambda$ is more restrictive than $q^- < \omega$. 
Evaluating Eq. (B.27) yields a contribution that scales with $\Lambda$ only below the leading 
term in $1/\epsilon$: 
>out. . _ ^sC, 1 / 4V y n 1 [AK 2A^\ , 8A 
-J. (.ra) - 2^ i (2Atanf)2 i '^"""^ [e^ + e 



(B.28) 

The zero-bin subtraction of Eq. (B.27) is 

X e(g-)e(g+)27r(5 (/+ - q+) © - tan^ f ) © (2A - q') 5(Ta). 

(B.29) 

Evaluating Eq. (B.29), we find the zero bin will exactly remove the leading term in 
$1/\epsilon$: 

™,out(0)/ ^ _ _^£^ \ ( 47r^^ \ 

27r r(l-6) l^(2Atanf)2j ^^"^^^62- ^^""^"^ 

Therefore, the difference is power suppressed only after the zero bin is included. 
Because other contributions when one particle is outside of the jet are similarly power 
suppressed, we will drop them in our remaining discussion of the jet functions. 

B.2.3 Unmeasured Quark Jet 

When the angularity of a jet is not measured, the jet function has no $\tau_a$ dependence. 
The naive and zero-bin contributions are the same as Eqs. (B.10) and (B.11) except 
for the factor of $\delta_R$. The zero-bin contribution is 

J 27r /+ J {2Trfq ^-q 

x2nS (z+-g+)e(/+-g+)ei?). 

This integral is scaleless and therefore equal to 0 in dimensional regularization. This 
implies that the NLO part of the quark jet function for an unmeasured jet is just the 
naive result. We find, making the divergent part explicit, in the $\overline{\rm MS}$ scheme, 

where the finite parts $J^q_{\rm alg}$ are^4 

J1 - ^ In ( 1 + ^ inM (I 1 + rf^'^'s 



2 / 2 



with the constant terms 



^,,cone ^c4l + 3\n2-^-^] , - =cJ^-^-^). (B.33) 



^4 The unmeasured jet function Eq. (B.32) is not simply obtained by integrating the measured 
jet function Eq. (B.19) over $\tau_a$. This is due to the different relative scaling of $R$ with the SCET 
expansion parameter $\lambda$ in a measured and unmeasured jet sector, as noted earlier. Namely, 
$R \sim \lambda^0$ in a measured jet sector (where $\lambda \sim \sqrt{\tau_a}$) while $\lambda \sim \tan(R/2)$ in an unmeasured jet 
sector. 
Appendix C 
OUR ANALYSIS IN DETAIL 

I here give a brief summary of the computational tools employed to do the studies 
in this thesis.^1 We simulate high-energy collisions using MadGraph/MadEvent v4.4.21 
[109] interfaced with Pythia v6.4 [112]. From the hadron-level output of Pythia, 
we group final-state particles into "cells" based on the segmentation of the ATLAS 
hadronic calorimeter ($\Delta\eta = 0.1$, $\Delta\phi = 0.1$ in the central region). We sum the four- 
momenta of all particles in each cell and rescale the resulting three-momentum to make 
the cell massless. After a threshold cut on the cell energy of 1 GeV, cells become the 
inputs to the jet algorithm. Our implementation of recombination algorithms uses 
FastJet [21] interfaced with SpartyJet. 
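The cell construction described above can be sketched in a few lines. This is an illustrative reimplementation, not the code used for the thesis: the function name `make_cells`, the tuple layout `(E, px, py, pz)` in GeV, and the choice to index cells by `floor(eta / 0.1)` and `floor(phi / 0.1)` are all my assumptions.

```python
import math
from collections import defaultdict

def make_cells(particles, d_eta=0.1, d_phi=0.1, e_min=1.0):
    """Group (E, px, py, pz) particles into eta-phi cells and return massless cells."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for E, px, py, pz in particles:
        pabs = math.sqrt(px * px + py * py + pz * pz)
        if pabs <= abs(pz):                  # exactly along the beam: no finite eta
            continue
        eta = 0.5 * math.log((pabs + pz) / (pabs - pz))
        phi = math.atan2(py, px)
        key = (math.floor(eta / d_eta), math.floor(phi / d_phi))
        cell = sums[key]
        for i, v in enumerate((E, px, py, pz)):
            cell[i] += v                     # sum the four-momenta in the cell
    cells = []
    for E, px, py, pz in sums.values():
        pabs = math.sqrt(px * px + py * py + pz * pz)
        if E < e_min or pabs == 0.0:         # threshold cut on the cell energy
            continue
        k = E / pabs                         # rescale |p| to E: the cell becomes massless
        cells.append((E, k * px, k * py, k * pz))
    return cells
```

The returned four-vectors would then be handed to the jet algorithm. Keeping the summed energy and rescaling only the three-momentum follows the description in the text; the direction of each cell is the direction of the summed momentum rather than the geometric cell center.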

Several of the plots in Sections 4 and 5 involve mass cuts on jets. The details of 
these cuts are provided in Sec. 5.4. 

C.l e+e~ events 

For the e+e- studies in Sec. 4.2, we generate e+e- → qq̄ and e+e- → tt̄ events with
center of mass energy Q = 1200 GeV. In the tt̄ events, the top quarks are required to
decay hadronically. We then apply the same minimal detector simulation and analysis
as for our simulated LHC events. We consider e+e- collisions only as a way to
study jets without the effects of initial state radiation, multiple interactions, pile-up,
etc., although of course e+e- collisions are interesting in their own right. The center
of mass energy has been chosen so that the p_T distribution of the jets is similar to
that for our second p_T bin pp tt̄ sample below. The two distributions are shown in
Fig. C.1. Note that whereas jets in the pp sample have a falling p_T distribution with
a lower cutoff, jets in the e+e- sample have a natural upper cutoff, along with the
same imposed lower cutoff.

^1 Parts of this appendix are taken from Appendix A of [2].
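The origin of the upper cutoff can be made concrete with a little kinematics. In the center of mass frame each top quark carries energy Q/2, and any set of final-state particles recoiling against the rest of the event has momentum at most Q/2. The sketch below evaluates both numbers; the top mass value is an assumption for illustration and is not taken from the text.

```python
import math

def top_momentum(Q, m_top):
    """Momentum of each top quark in e+e- -> t tbar at center of mass energy Q."""
    energy = Q / 2.0
    return math.sqrt(energy * energy - m_top * m_top)

Q = 1200.0      # GeV, as in the text
M_TOP = 173.0   # GeV, assumed value for illustration

p_top = top_momentum(Q, M_TOP)   # bound on the top quark p_T
p_jet_max = Q / 2.0              # kinematic bound on the momentum of any jet

print("top quark momentum: %.1f GeV" % p_top)
print("maximum jet momentum: %.1f GeV" % p_jet_max)
```

Since p_T cannot exceed the total momentum, the jet p_T spectrum in the e+e- sample must end at or below Q/2.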




Figure C.1: Distribution in p_T for top quark jets in the e+e- sample (red) and the pp
sample (blue). (Horizontal axis: jet p_T, with ticks from 500 to 700.)

C.2 pp events 

We also study jets in pp collisions. We employ MLM-style matching, implemented in
MadGraph (see, e.g., [130]), on the backgrounds. We have checked that our matching
parameters are reasonable using the tool MatchChecker [131]. We use the DWT tune
[132] in Pythia to give a "noisy" underlying event (UE). For the hadron-level studies
in Sec. 4.2, we exclude (include) the underlying event by setting the Pythia parameter
MSTP(81) to zero (one), turning off (on) multiple interactions. To exclude (include)
initial state radiation, we set MSTP(61) to zero (one). Both ISR and UE are on unless
otherwise noted.
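The two switches can be summarized as a small configuration helper. The helper name and the dictionary format are hypothetical; how the values are actually handed to Pythia (for example through a MadGraph card) is not shown here.

```python
def pythia_switches(include_ue=True, include_isr=True):
    """Return the Pythia 6 parameter values described in the text.

    MSTP(81) controls multiple interactions (the underlying event);
    MSTP(61) controls initial state radiation. Both default to on.
    """
    return {
        "MSTP(81)": 1 if include_ue else 0,
        "MSTP(61)": 1 if include_isr else 0,
    }

# Example: a hadron-level sample with neither UE nor ISR
no_ue_no_isr = pythia_switches(include_ue=False, include_isr=False)
```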
We perform no detector simulation, other than the calorimeter clustering noted
above, so we can isolate the "best case" effects of our method. In Sec. 5.5.7, we
examine the effects of Gaussian smearing on the energies of final state particles from
Pythia to get a sense for how much the results may change with a detector.

For the W study, the signal sample is W+W- pair production, with exactly one W
required to decay leptonically. The background is a matched sample of a leptonically
decaying W and one or two light partons (gluons and the four lightest quarks) before
showering. These partons must be in the central region, |η| < 2.5. Signal and
background samples are divided into four p_T bins: [125, 200], [200, 275], [275, 350],
and [350, 425] (all in GeV). Each bin is defined by a p_T cut that is applied to single
jets in the analysis. These bins confine the W boost to a narrow range and allow us
to study the performance of pruning as the jet p_T (or W boost) varies.

For each p_T bin [p_T^min, p_T^max], both samples are generated with a p_T cut on the
leptonic W of p_T^min - 25 GeV. For the background, we set the matching scales
(Q_cut^ME, Q_match) to be (10, 15) GeV in all four bins.

For the top quark reconstruction study, the signal sample is tt̄ production with
fully hadronic decays. The background is a matched sample of QCD multijet production
with two, three, or four light partons, with the same cut on parton centrality as
in the W study. Samples are again divided into four p_T bins: [200, 500], [500, 700],
[700, 900], and [900, 1100] (all in GeV).

We generate signal and background samples with a parton-level H_T cut for generation
efficiency, where H_T is the scalar sum of all p_T in the event. For each p_T bin
[p_T^min, p_T^max], the parton-level H_T cut is
p_T^min - 25 GeV < H_T/2 < p_T^max + 100 GeV. For
the background, we use matching scales (20, 30) GeV for the smallest p_T bin and (50,
70) GeV in the other three bins.
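As a quick check of the generation cuts, the H_T window for each top quark p_T bin follows directly from the inequality above. The sketch below is illustrative only; the function and variable names are not from the original analysis.

```python
TOP_PT_BINS = [(200, 500), (500, 700), (700, 900), (900, 1100)]  # GeV

def ht_window(pt_min, pt_max, low_margin=25.0, high_margin=100.0):
    """Return (HT_min, HT_max) from pt_min - 25 GeV < HT/2 < pt_max + 100 GeV."""
    return 2.0 * (pt_min - low_margin), 2.0 * (pt_max + high_margin)

for pt_min, pt_max in TOP_PT_BINS:
    ht_min, ht_max = ht_window(pt_min, pt_max)
    print("pT bin [%d, %d] GeV: %g GeV < HT < %g GeV" % (pt_min, pt_max, ht_min, ht_max))
```

For the second bin the lower edge comes out at 950 GeV, consistent with the cut on the sum of parton p_T quoted for that bin in Sec. C.2.1.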
C.2.1 Matched vs. unmatched samples

We use matched samples for our QCD backgrounds, that is, samples where the full
matrix element weighting is used for additional partons in the hard process. This
gives background samples with somewhat heavier mass distributions and "harder"
substructure. Large jet masses and significant substructure are perturbative effects,
and are enhanced by including the full matrix elements. We expect that substructure
predictions made with matched backgrounds will be more reliable.

As an example, consider the plots in Fig. C.2. Three samples are compared:
"dijet" refers to showered 2 → 2 processes. The "matched" sample is the sample
used throughout the paper and described above, with matrix elements for two, three,
and four hard partons. The "unmatched" sample has the same set of matrix elements,
but with no matching, i.e., no attempt is made to remove double counting. That
its mass spectrum is much harder than that of either of the other samples suggests that the
double counting is significant.

All three samples use the same MadGraph phase space cuts: {xqcut > 50 GeV,
htjmin > 950 GeV}, corresponding to the second p_T bin of the top quark background
samples. The first cut requires partons to be separated by 50 GeV in k_T distance,
and to each have p_T > 50 GeV as well. The second requires that Σ |p_T| > 950 GeV,
where the sum is over all partons. For the dijet sample the first cut has no effect.
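The effect of the two cuts can be illustrated with a short Python sketch. It assumes the common form of the k_T distance, min(p_T,i, p_T,j) ΔR_ij with D = 1; the generator's exact definition may differ in detail, and the function names are invented for this example.

```python
import math

def kt_distance(a, b):
    """k_T distance between two partons given as (pT, y, phi); assumes D = 1."""
    dphi = abs(a[2] - b[2])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    delta_r = math.hypot(a[1] - b[1], dphi)
    return min(a[0], b[0]) * delta_r

def passes_cuts(partons, xqcut=50.0, htjmin=950.0):
    """Apply the parton-level cuts described in the text."""
    # each parton must have pT above xqcut
    if any(p[0] <= xqcut for p in partons):
        return False
    # each pair must be separated by more than xqcut in k_T distance
    for i in range(len(partons)):
        for j in range(i + 1, len(partons)):
            if kt_distance(partons[i], partons[j]) <= xqcut:
                return False
    # the scalar sum of parton pT must exceed htjmin
    return sum(p[0] for p in partons) > htjmin
```

For two back-to-back partons that satisfy the H_T requirement, both parts of the first cut are satisfied automatically, which is why it has no effect on the dijet sample.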

The distributions in Fig. C.2 are individually normalized to unit integral. The
leading order cross sections are given in Table C.1.

The important comparison is between the dijet sample and the matched sample.
The matched sample has a slightly harder mass spectrum, even more noticeable when
we scale by p_T^jet. In the lower left we see that the distribution in a_1, the measure
of subjet mass used repeatedly in this thesis, does not change much. However, in
the lower right I show another variable, inspired by the CMS top tagger [133]. The
"minimum subjet mass" is defined to be the minimum pairwise mass between subjets
if the jet is unclustered to three subjets (by undoing the last two clustering steps).
In addition to the CMS top tagger, this variable is used in the pruning top tagger
described in [134]. We see that the matched sample has significantly more jets with
large minimum subjet mass.

Figure C.2: Distributions in m_jet, m_jet/p_T^jet, a_1, and "minimum subjet mass". a_1 is
the mass of the heavier subjet scaled to the jet mass; the "minimum subjet mass" is
the minimum pairwise mass between subjets if the jet is unclustered to three subjets.
Jets have p_T > 500 GeV.

Sample                  LO cross section (pb)
dijet                   788.5 ± 0.5
2-4 parton unmatched    3424 ± 2
2-4 parton matched      964 ± 2

Table C.1: Leading order cross sections for the three samples in Fig. C.2. Note the
extreme overcounting if we include additional hard partons but do not match. The
cross sections are taken from MadGraph and include statistical errors.
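The size of the overcounting noted in Table C.1 can be read off by taking ratios of the quoted leading order cross sections to the dijet value. The sketch below uses only the central values from the table; the dictionary and function names are illustrative.

```python
XSEC_PB = {
    "dijet": 788.5,
    "2-4 parton unmatched": 3424.0,
    "2-4 parton matched": 964.0,
}

def ratio_to_dijet(sample):
    """Ratio of a sample's LO cross section to the dijet cross section."""
    return XSEC_PB[sample] / XSEC_PB["dijet"]

for name in ("2-4 parton unmatched", "2-4 parton matched"):
    print("%s / dijet = %.2f" % (name, ratio_to_dijet(name)))
```

The unmatched sample is more than four times larger than the dijet sample, while the matched sample exceeds it by roughly a fifth.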

The lesson is clear: the details of jet substructure seen in simulated events depend
heavily on the details of the Monte Carlo modeling. Since jet substructure is
fundamentally a higher-order effect, it is natural that higher-order simulation makes
a difference.
Appendix D
SPARTYJET EXAMPLE

In this appendix I give two brief examples of SpartyJet analyses, with the goal 
of comparing pruning to top-tagging for top finding and comparing pruning to the 
mass-drop filter method in W finding. I will first walk through the implementation 
to demonstrate the construction of a SpartyJet analysis, then show some results. 

D.1 Implementation

Both analyses use the following simple wrapper function that handles input and out- 
put, setting up a few input selector tools: 

# SJ (SpartyJet) and fj (FastJet) are assumed to be imported by the calling script.
def RunAlgorithms(infile, outfile, jetAlgs, pTCut=50, N=-1):
    """
    This function wraps the algorithm-running functionality of SpartyJet.
    infile is assumed to be in 'HuskyInput' ('UW') or StdHep format.
    N events are processed; N = -1 is all events.
    Output is stored in outfile.root.
    jetAlgs must be a list (or iterable container) of SpartyJet JetTools;
    specifically, these should be jet finders.
    pTCut is the final pT cut on jets.
    """

    # Create a jet builder
    builder = SJ.JetBuilder()
    builder.silent_mode()  # turns off debugging information

    # Configure input
    if (infile.find('UW') != -1):
        input = SJ.HuskyInput(infile)
    elif (infile.find('hep') != -1):
        input = SJ.StdHepInput(infile)
    else:
        print('Unrecognized input format in ' + infile)
        exit(1)
    builder.configure_input(input)

    # Configure output
    #builder.add_text_output(outfile+".dat")
    builder.configure_output("SpartyJet_Tree", outfile)
    builder.output_var_style.array_type = "vector"  # output as "array" or "vector"
    builder.output_var_style.base_type = "float"    # output as "float" or "double"

    for t in jetAlgs:
        builder.add_custom_alg(t)

    # Add input cuts
    builder.add_jetTool_front(SJ.JetPtSelectorTool(0.5))
    builder.add_jetTool_front(SJ.JetEtaCentralSelectorTool(-4.9, 4.9))

    # Add output cuts
    builder.add_jetTool(SJ.JetPtSelectorTool(pTCut))
    builder.add_jetTool(SJ.JetEtaCentralSelectorTool(-2.5, 2.5))

    # Add jet moments
    SubjetMassMoment = SJ.HeavierSubjetMass('subjetM')
    builder.add_jetTool(SJ.JetMomentTool('subjetM', SubjetMassMoment))
    SubjetMassMoment = SJ.HeavierSubjetMass('a1', True)  # scale to jet mass
    builder.add_jetTool(SJ.JetMomentTool('a1', SubjetMassMoment))
    zMoment = SJ.zMoment('z')
    builder.add_jetTool(SJ.JetMomentTool('z', zMoment))
    DeltaRMoment = SJ.DeltaRMoment('DeltaR')
    builder.add_jetTool(SJ.JetMomentTool('DeltaR', DeltaRMoment))

    # Run SpartyJet
    if N > 0:
        builder.print_event_every(max(1, N // 20))
    else:  # process all if N = -1
        builder.print_event_every(1000)
    builder.process_events(N)

The main input is a set of JetAlgorithms. These are defined for the top and W 
analyses by the following functions: 

def TopCompareAnalysis(infile, outfile, N=-1):
    algs = {}

    # set up initial antikt
    algs['AntiKt10'] = SJ.JetAlgorithm(
        SJ.FastJet.FastJetFinder('AntiKt10', fj.antikt_algorithm, 1.0, False))
    algs['AntiKt10'].addTool(SJ.JetPtSelectorTool(500))
    initialJets = SJ.ForkToolParent('AntiKt10Parent')
    algs['AntiKt10'].addTool(initialJets)

    # recluster with CA, fork again
    algs['CA10'] = SJ.JetAlgorithm(SJ.ForkToolChild(initialJets, 'CA10'))
    algs['CA10'].addTool(
        SJ.FastJet.FastJetRecluster('CA10cluster', fj.cambridge_algorithm, 1.5, False))
    CAjets = SJ.ForkToolParent('CA10Parent')
    algs['CA10'].addTool(CAjets)

    # JH tagger
    algs['CA10JH'] = SJ.JetAlgorithm(SJ.ForkToolChild(CAjets, 'CA10JH'))
    algs['CA10JH'].addTool(
        SJ.FastJet.TopTaggerTool(fj.JHTopTagger)(fj.JHTopTagger(0.1, 0.19, 81.0)))

    # Alternative, more aggressive JH tagger
    algs['CA10JH2'] = SJ.JetAlgorithm(SJ.ForkToolChild(CAjets, 'CA10JH2'))
    JHPrune = SJ.JHPruneTool(0.1, 0.19, 2)
    algs['CA10JH2'].addTool(JHPrune)
    algs['CA10JH2'].addTool(SJ.SubjetCutTool(JHPrune, 3, True))
    algs['CA10JH2'].addTool(SJ.MinMassTool())

    # pruning
    algs['CA10prune'] = SJ.JetAlgorithm(SJ.ForkToolChild(CAjets, 'CA10prune'))
    big_CA_def = fj.JetDefinition(fj.cambridge_algorithm, 3.14*0.5)
    algs['CA10prune'].addTool(SJ.FastJet.FastPruneTool(big_CA_def))

    RunAlgorithms(infile, outfile, algs.values(), 500, N)

def WCompareAnalysis(infile, outfile, N=-1):
    algs = {}

    # set up initial antikt
    algs['AntiKt10'] = SJ.JetAlgorithm(
        SJ.FastJet.FastJetFinder('AntiKt10', fj.antikt_algorithm, 1.0, False))
    algs['AntiKt10'].addTool(SJ.JetPtSelectorTool(200))
    initialJets = SJ.ForkToolParent('AntiKt10Parent')
    algs['AntiKt10'].addTool(initialJets)

    # recluster with CA, fork again
    algs['CA10'] = SJ.JetAlgorithm(SJ.ForkToolChild(initialJets, 'CA10'))
    algs['CA10'].addTool(
        SJ.FastJet.FastJetRecluster('CA10cluster', fj.cambridge_algorithm, 1.5, False))
    CAjets = SJ.ForkToolParent('CA10Parent')
    algs['CA10'].addTool(CAjets)

    # MDF analysis
    algs['CA10MDF'] = SJ.JetAlgorithm(SJ.ForkToolChild(CAjets, 'CA10MDF'))
    subjetFinder = SJ.MassDropTool(0.67, 0.09, 1, 'MassDrop')
    algs['CA10MDF'].addTool(subjetFinder)
    algs['CA10MDF'].addTool(SJ.SubjetCutTool(subjetFinder, 2))
    algs['CA10MDF'].addTool(SJ.FastJet.BDRSFilterTool(1.2, 0.3, 3))

    # pruning
    algs['CA10prune'] = SJ.JetAlgorithm(SJ.ForkToolChild(CAjets, 'CA10prune'))
    big_CA_def = fj.JetDefinition(fj.cambridge_algorithm, 3.14*0.5)
    algs['CA10prune'].addTool(SJ.FastJet.FastPruneTool(big_CA_def))

    RunAlgorithms(infile, outfile, algs.values(), 200, N)

The new plots in Chapters 4 and 5 were generated with similar functions, not
given here. The input file must be in "UW" or StdHEP format (the former is a
simple text format); the output is a SpartyJet ROOT file with all jet information
stored, including measured values of ΔR, a_1, and m_1 (the last two both look for
the heavier subjet; a_1 = m_1/m_J).
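To make the stored quantities concrete, the following standalone sketch computes m_1, a_1, z, and ΔR for a jet that splits into two subjets. It does not use SpartyJet: four-vectors are plain (E, px, py, pz) tuples, the helper names are invented, and z is taken to be min(p_T,1, p_T,2)/p_T,J as in the pruning procedure, which is an assumption about the moment's definition.

```python
import math

def add(p, q):
    return tuple(a + b for a, b in zip(p, q))

def mass(p):
    m2 = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
    return math.sqrt(max(m2, 0.0))

def pt(p):
    return math.hypot(p[1], p[2])

def rapidity(p):
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))

def phi(p):
    return math.atan2(p[2], p[1])

def substructure(sub1, sub2):
    """Return m1, a1 = m1/mJ, z, and DeltaR for a two-subjet splitting."""
    jet = add(sub1, sub2)
    m1 = max(mass(sub1), mass(sub2))          # mass of the heavier subjet
    dphi = abs(phi(sub1) - phi(sub2))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return {
        "m1": m1,
        "a1": m1 / mass(jet),                  # assumes the jet mass is nonzero
        "z": min(pt(sub1), pt(sub2)) / pt(jet),
        "DeltaR": math.hypot(rapidity(sub1) - rapidity(sub2), dphi),
    }

# Example: a subjet of mass 80 GeV plus a massless subjet
result = substructure((100.0, 60.0, 0.0, 0.0), (50.0, 0.0, 50.0, 0.0))
```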

The W analysis compares initial anti-k_T jets, jets reclustered with CA (identical
contents but different substructure), and CA jets with pruning or mass-drop filtering
[15] applied. The top analysis compares the same initial jets with pruned or top-tagged
[18] jets. This analysis also includes an additional top-tagging implementation I have
set up with a set of SpartyJet tools. This version discards asymmetric branchings
even for subjets that do not eventually split, so is somewhat more aggressive.^1 In
addition, my implementation finds the W subjet by unclustering the top jet to three
subjets, then merging the pair with minimum combined mass, as in the CMS top
tagging implementation [133]. The original implementation simply takes the pair
with combined mass closest to m_W. No attempt has been made to optimize the
parameters of this modified top tagger; it is included as an example of a SpartyJet
tool implementation and a foil for the other methods.
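The two ways of choosing the W candidate among three subjets can be contrasted in a short standalone sketch. It is not the SpartyJet or FastJet implementation: subjets are plain (E, px, py, pz) tuples, the function names are invented, and the W mass value is an assumption.

```python
import math
from itertools import combinations

M_W = 80.4  # GeV, assumed value for illustration

def pair_mass(p, q):
    s = [a + b for a, b in zip(p, q)]
    m2 = s[0] ** 2 - s[1] ** 2 - s[2] ** 2 - s[3] ** 2
    return math.sqrt(max(m2, 0.0))

def w_candidate_min_mass(subjets):
    """Pair with the minimum combined mass: returns (i, j, mass)."""
    pairs = combinations(range(len(subjets)), 2)
    return min(((i, j, pair_mass(subjets[i], subjets[j])) for i, j in pairs),
               key=lambda t: t[2])

def w_candidate_closest_to_mw(subjets, m_w=M_W):
    """Pair with combined mass closest to m_W: returns (i, j, mass)."""
    pairs = combinations(range(len(subjets)), 2)
    return min(((i, j, pair_mass(subjets[i], subjets[j])) for i, j in pairs),
               key=lambda t: abs(t[2] - m_w))

# Three massless subjets in the transverse plane, at 0, 60, and 90 degrees
c60, s60 = math.cos(math.radians(60.0)), math.sin(math.radians(60.0))
subjets = [(100.0, 100.0, 0.0, 0.0),
           (64.0, 64.0 * c60, 64.0 * s60, 0.0),
           (10.0, 0.0, 10.0, 0.0)]
```

In this toy configuration the two prescriptions disagree: the pair closest to m_W is the first two subjets, while the minimum-mass pair involves the soft third subjet.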

D.2 Top quark results 

The jet mass distribution for each method is shown in Fig. D.1. The events are
the same as in the second p_T bin studied in Chapter 5, with jets having p_T > 500
GeV. All three substructure methods improve on plain anti-k_T jets. As expected,
pruning removes more soft radiation than top-tagging, since pruning is applied to the
whole jet; the result is a mass peak that is slightly higher but shifted slightly lower.
The second implementation of top tagging is shifted even further lower but is clearly
over-grooming: the z and ΔR criteria used by top tagging are both looser than
for pruning, resulting in more vetoed mergings for the "JH2" sample. The jet mass
windows, found as described in Sec. 5.4.1, are given in Table D.1.

After restricting jets to lie in the mass windows given in Table D.1, we can look
for evidence of the W mass. In Fig. D.2 we plot the found subjet mass; for the JH
tagger we use the identified W; for the other three we take the heavier subjet. The
results are broadly similar, with pruning giving a slightly narrower peak and "JH2" a
slightly wider peak than the JH tagger. Again using the methods of Sec. 5.4.1 we can
find the subjet mass windows, also given in Table D.1.

^1 To illustrate the difference, consider their action on a putative top jet. Both will remove
soft, wide-angle splittings from the jet until a top-level splitting is found. Both will then repeat this
procedure on the two subjets. Suppose that for one subjet, several soft protojets are
discarded before an irreducible splitting is found; that is, the subjet does not split. The original top-tagger
(at least as implemented in Gavin Salam's JHTopTagger.hh [22]) keeps the entire subjet; my
implementation discards the soft protojets and keeps only the subjet formed at the irreducible
splitting.

Figure D.1: Distribution in m_J for anti-k_T jets reclustered with CA, then pruned
or top-tagged. "CA10JH" is the original Johns Hopkins tagger; "CA10JH2" is my
variant. Jets have p_T > 500 GeV; the initial D = 1.0.

Method               m_jet low   m_jet high   m_subjet low   m_subjet high
CA10                 160.3       187.9        72.6           84.6
CA10 + pruning       165.7       178.3        73.8           83.8
CA10 + JH tagger     165.4       180.3        73.5           85.1
CA10 + JH2 tagger    163.7       179.6        72.4           85.1

Table D.1: Jet mass and subjet mass windows for each top-finding method.

Figure D.2: Distribution in subjet mass for anti-k_T jets reclustered with CA, then
pruned or top-tagged. "CA10JH" is the original Johns Hopkins tagger; "CA10JH2"
is my variant. For the JH tagger the identified W subjet is used; for the others I take
the heavier subjet. Jets have p_T > 500 GeV; the initial D = 1.0.

In Fig. D.3 we give the jet and subjet mass distributions for the background sample
(the same matched multijet as in Chapter 5, p_T bin 2). Note that the JH tagger takes
three or four subjets and merges the two closest in combined mass to m_W, producing
a peak in the background subjet mass distribution. The "minimum mass" taken in
JH2, and the CMS implementation of the JH tagger, does not share this feature.

The tagging efficiencies and mis-tag rates for each method are given in Table
D.2. The efficiency (mis-tag rate) for each method is the number of jets in the signal
(background) sample that survive after all cuts, divided by the number of initial jets
that pass the p_T cut. Only mass cuts are imposed, unlike in the original top-tagging
analysis which also used a cut on the cosine of the helicity angle, cos θ_h.
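The definition of the efficiency and mis-tag rate can be written out as a small function. This is a simplified sketch, not the analysis code: each jet is represented by a dictionary holding its initial p_T and its masses after grooming, and the record layout and function name are assumptions made here.

```python
def tag_rate(jets, pt_cut, jet_window, subjet_window=None):
    """Fraction of jets passing the pT cut that also pass the mass cuts.

    jets: list of dicts with keys 'pt', 'm_jet', 'm_subjet' (GeV).
    jet_window, subjet_window: (low, high) mass windows; the subjet cut
    is skipped if subjet_window is None.
    """
    initial = [j for j in jets if j["pt"] > pt_cut]
    if not initial:
        return 0.0

    def in_window(value, window):
        return window[0] < value < window[1]

    passed = [j for j in initial
              if in_window(j["m_jet"], jet_window)
              and (subjet_window is None or in_window(j["m_subjet"], subjet_window))]
    return len(passed) / float(len(initial))

# Toy example using the pruning windows of Table D.1
toy_jets = [
    {"pt": 550.0, "m_jet": 172.0, "m_subjet": 80.0},
    {"pt": 600.0, "m_jet": 170.0, "m_subjet": 60.0},
    {"pt": 520.0, "m_jet": 120.0, "m_subjet": 79.0},
    {"pt": 450.0, "m_jet": 173.0, "m_subjet": 81.0},  # fails the pT cut
]
rate_jet_only = tag_rate(toy_jets, 500.0, (165.7, 178.3))
rate_both = tag_rate(toy_jets, 500.0, (165.7, 178.3), (73.8, 83.8))
```

Applied to a signal sample the function gives the tagging efficiency; applied to a background sample it gives the mis-tag rate.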
Figure D.3: Background distribution in jet and subjet mass for anti-k_T jets reclustered
with CA, then pruned or top-tagged. "CA10JH" is the original Johns Hopkins tagger;
"CA10JH2" is my variant. For the JH tagger the identified W subjet is used; for the
others I take the heavier subjet. Jets have p_T > 500 GeV; the initial D = 1.0.

                     Signal                                 Background
Method               m_jet cut   m_jet and m_subjet cuts    m_jet cut   m_jet and m_subjet cuts
CA10                 0.49        0.05                       0.054       0.0018
CA10 + pruning       0.27        0.11                       0.016       0.00075
CA10 + JH tagger     0.27        0.20                       0.0086      0.0025
CA10 + JH2 tagger    0.23        0.14                       0.0074      0.0014

Table D.2: Tagging efficiencies and mis-tag rates for each method, applied to tt̄ events
(Signal) and matched multi-jet events (Background). Initial jets have p_T > 500 GeV
and D = 1.0. Efficiencies are relative to initial numbers of jets passing the p_T cut.
Figure D.4: Distribution in m_J for anti-k_T jets reclustered with CA, then pruned or
mass-drop filtered. Jets have p_T > 200 GeV; the initial D = 1.0.

D.3 W results 

We now turn to W finding, repeating the analysis of the previous section but this time
comparing pruning to the mass-drop filter method. The signal jet mass distributions
are shown in Fig. D.4. We can see that the performance of pruning is quite similar
to the mass-drop filter method. The mass windows for each method are given in
Table D.3. The background jet mass distributions are shown in Fig. D.5. Tagging
and mis-tagging efficiencies are given in Table D.4. We can see that pruning and
mass-drop filtering are both superior to plain CA, but that they are quite similar in
performance.
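One way to quantify "superior to plain CA" is to compare each method's signal-to-background ratio and significance with those of the ungroomed CA jets. The sketch below does this with the rates from Table D.4; the choice of S/B and S/√B as figures of merit, and all names in the code, are assumptions made for illustration.

```python
import math

# (signal efficiency, background mis-tag rate) from Table D.4
RATES = {
    "CA10": (0.62, 0.117),
    "CA10 + pruning": (0.54, 0.036),
    "CA10 + MDF": (0.57, 0.042),
}

def improvement(method, baseline="CA10"):
    """Return the relative gains in S/B and S/sqrt(B) over the baseline."""
    sig, bkg = RATES[method]
    sig0, bkg0 = RATES[baseline]
    eps_s = sig / sig0   # fraction of baseline signal retained
    eps_b = bkg / bkg0   # fraction of baseline background retained
    return eps_s / eps_b, eps_s / math.sqrt(eps_b)

for name in ("CA10 + pruning", "CA10 + MDF"):
    sb, ssqrtb = improvement(name)
    print("%s: S/B x %.2f, S/sqrt(B) x %.2f" % (name, sb, ssqrtb))
```

With these inputs both groomers improve S/B by a factor between 2.5 and 3 and S/√B by roughly 1.5, with pruning marginally ahead, in line with the statement that the two methods perform similarly.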
Figure D.5: Background distribution in m_J for anti-k_T jets reclustered with CA, then
pruned or mass-drop filtered. Jets have p_T > 200 GeV; the initial D = 1.0.

Method               m_jet low   m_jet high
CA10                 69.0        89.9
CA10 + pruning       71.4        84.0
CA10 + MDF           71.7        86.4

Table D.3: Jet mass windows for each W-finding method.
Method               Signal   Background
CA10                 0.62     0.117
CA10 + pruning       0.54     0.036
CA10 + MDF           0.57     0.042

Table D.4: Tagging efficiencies and mis-tag rates for each method after a jet mass
cut, applied to semileptonic WW events (Signal) and matched W+jets events (Background).
Initial jets have p_T > 200 GeV and D = 1.0. Efficiencies are relative to
initial numbers of jets passing the p_T cut.